COVALENT CHEMICAL INDUCERS OF PROTEIN DIMERIZATION
AND THEIR USES IN HIGH THROUGHPUT BINDING SCREENS 

This invention has been made with government support under 
National Science Foundation grants CHE-9626981, CHE-9977402, and 
CHE-9984928. Accordingly, the U.S. Government has certain 
rights in the invention. 

Throughout this application, various publications are referenced 
by author or author and date. Full citations for these 
publications may be found listed alphabetically at the end of 
the specification immediately preceding the claims. The 
disclosures of these publications in their entireties are hereby 
incorporated by reference into this application in order to more 
fully describe the state of the art as known to those skilled 
therein as of the date of the invention described and claimed 
herein. 

Field of Invention 

This invention relates to high throughput screening of cDNA
libraries.

Background of the Invention 

The majority of known proteins were identified using traditional 
genetics or biochemistry. The availability of complete genome 
sequences for several organisms and the anticipation of the 
completion of the human genome project effectively make 
thousands of new proteins "known". The problem, however, is 
that while thousands of new open reading frames (ORFs) have been 
identified, the functions of these proteins remain a mystery. 
Sequence analysis is a powerful predictor of protein function, 
but many ORFs cannot be assigned by sequence analysis and 
experimental characterization is still required to ascertain 
protein function. There is tremendous interest in high-throughput
approaches for testing cDNA libraries, which include
thousands of unique ORFs, using genetic or biochemical screens.
The hurdles are the same as in all high-throughput screening 
applications. First, cDNA libraries must be available in a 
format that is compatible with screening technologies and that 
allows rapid identification of individual cDNAs. Second,
high-throughput screens must be developed.

Commercial cDNA libraries, tissue-specific cDNA libraries, and 
even cell-cycle-specific cDNA libraries from a variety of 
organisms are readily available. Over the past few years, these
cDNA libraries have been adapted to several formats amenable to 
high-throughput screening. 

Expression cloning relies on split-pool in vitro 
transcription/translation and has the advantage that it is 
compatible with many traditional biochemical assays. Winzler et 
al. used homologous recombination to engineer 2026 unique yeast 
strains — each containing a knock-out of a different ORF and 
replacing that ORF with a unique 20 base-pair tag. Several 
laboratories have reported specific yeast two-hybrid cDNA 
libraries, and many of these libraries are even distributed 
commercially (Clontech). Martzen et al. constructed 6144
individual yeast strains where each strain expresses a unique
S. cerevisiae ORF-GST fusion protein under control of the P_CUP1
promoter. Because of the facility of homologous recombination
in S. cerevisiae, these cDNA libraries were prepared simply by
co-transforming the cDNA library with the appropriate linearized 
vector. Thus, replicating these expression formats with 
different cDNA libraries is routine. 

The most common traditional genetic selection is lethality, or
synthetic lethality. A variety of phenotype-specific screens
have also been employed. However, most of these are too
time-consuming for screening cDNA libraries. A few
phenotype-specific selections have been reported. Screens and selections
designed for the high-throughput screening of cDNA libraries 
have also been reported. One of the major applications 
envisioned for the yeast two-hybrid assay is the screening of 
cDNA libraries for protein-protein interactions. The success
of the yeast three-hybrid assay suggests that it should also be 
possible to screen cDNA libraries for small-molecule-protein 
interactions. Another approach is to screen for changes in
expression levels of individual cellular RNAs. In Genetic
Footprinting, random Ty1 transposon insertions in genomic DNA
are used as markers for changes in the expression levels of
endogenous RNAs based on reverse transcription and gel
electrophoresis. The use of unique oligonucleotide tags rather
than Ty1 transposon insertions facilitates rapid identification
of individual RNAs. DNA microarrays, in which oligonucleotides
corresponding to each individual ORF are synthesized on chips
in a spatially-resolved format, have been used successfully in
a number of recent applications. A recent report in which
expression cloning identified a new family of uracil-DNA
glycosylases from a Xenopus cDNA library based on in vitro
binding assays suggests the importance of screening based on
biochemical activity.

WO 01/53355 describes a number of screening approaches,
including the use of small molecules to induce protein
dimerization to screen cDNA libraries based on binding, or small
molecules with cleavable linkers to screen cDNA libraries based
on catalysis. The CID technology offers a promising approach
to screening cDNA libraries based on function because a variety
of activities can be assayed simply by changing one of the CID
ligand/receptor pairs or by changing the bond between the CID
ligands.

However, the existing CID approaches rely on 4 non-covalent
interactions taking place simultaneously for the reporter
protein to be activated: 1) the DNA-binding
protein-DNA interaction, 2) the 1st ligand-receptor interaction,
3) the 2nd ligand-receptor interaction, and 4) the activation
domain-transcription machinery interaction. This approach is
useful in certain types of screens.

An approach not employed by the reported CID screens is making 
a system with only 3 non-covalent interactions, yet still 
employing a small molecule as the CID.

Summary of the Invention

Embodiments of this invention are compounds having the
formula:

H1-Y-H2

where H1 is a substrate capable of selectively binding to a
first receptor; where H2 is a substrate capable of selectively
binding to and selectively forming a covalent bond with a second
receptor; and wherein Y is a moiety providing a covalent linkage
between H1 and H2, which may be present or absent, and when
absent, H1 is covalently linked to H2. Also described are cells
for use with the compounds for in vivo screening of compounds
and proteins.

In this compound, the 1st ligand-receptor pair (Dex-GR in Figure
13) is replaced with a small molecule-receptor pair that
forms an irreversible covalent linkage, making a system with only
3 non-covalent interactions. Such an approach allows for the
screening of small molecules to identify their cellular targets.
This covalent CID system is used for screening ligand-receptor
interactions, which previously required laborious
photo-crosslinking, radiolabeled ligand binding, and
affinity chromatography techniques. The covalent system is more
sensitive than the Dex-Mtx system because the covalent bond
gives a k_off of zero for the covalent ligand-protein binding
pair, which improves the cut-off K_D of the whole system.
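
The sensitivity argument can be made concrete with the standard
relation K_D = k_off / k_on. The sketch below is illustrative only:
it assumes simple 1:1 equilibrium binding with ligand in excess, and
the rate constants and ligand concentration are hypothetical values,
not measurements from this work.

```python
def dissociation_constant(k_off: float, k_on: float) -> float:
    """K_D = k_off / k_on for simple 1:1 binding (units: M)."""
    return k_off / k_on


def fraction_bound(ligand_conc: float, k_d: float) -> float:
    """Equilibrium receptor occupancy, assuming ligand in excess."""
    return ligand_conc / (ligand_conc + k_d)


# Hypothetical numbers, chosen only for illustration.
K_ON = 1e6       # association rate constant, M^-1 s^-1
LIGAND = 1e-8    # 10 nM ligand

# Reversible pair: a nonzero k_off gives a finite K_D.
reversible_kd = dissociation_constant(k_off=1e-2, k_on=K_ON)
# Covalent pair: k_off of zero gives K_D of zero.
covalent_kd = dissociation_constant(k_off=0.0, k_on=K_ON)

print(fraction_bound(LIGAND, reversible_kd))  # roughly half occupied
print(fraction_bound(LIGAND, covalent_kd))    # fully occupied
```

In this simplified picture the covalent pair no longer limits the
occupancy of the complex, so the remaining non-covalent ligand-receptor
pair sets the affinity cut-off of the screen.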

Description of the Figures

Figure 1. The selection strategy. Proteins V and W do not
interact (A) until a BOND links the handles H1 and H2 (B). The
selection can be run in the forward direction to select for BOND
formation or the reverse direction to select for BOND cleavage.

Figure 2. The yeast three-hybrid system. The small molecule 
dexamethasone-FK506 (H1-H2) mediates the dimerization of the 
LexA-GR (glucocorticoid receptor) and B42-FKBP12 protein 
fusions. Dimerization of the DNA-binding protein LexA and the
activation domain B42 activates transcription of the lacZ 
reporter gene. 

Figure 3. The model reaction. Cephalosporin hydrolysis by the
908R cephalosporinase.

Figure 4. DEX-CEPHEM-FK506 retrosynthesis. Cephem 1 is
commercially available. DEX-CO2H is prepared via oxidation of
the C20 α-hydroxy ketone; FK506-CO2H, via a cross-metathesis
reaction with the C21 allyl group.

Figure 5. The chemical handles dexamethasone (A), FK506 (B),
and methotrexate (C).

Figure 6. The dexamethasone-methotrexate molecules synthesized.
The diamine linkers are commercially available and vary in
length and hydrophobicity.

Figure 7. The Claisen rearrangement (A) and the Diels-Alder
reaction (B) are both pericyclic reactions with six-membered
transition states.

Figure 8. The retrosynthesis of the diene (A) and the
dienophile (B). A Curtius rearrangement is used to introduce
the carbamyl linkage to H1 in the diene. (Overman) A Stille
coupling is used to introduce the alkyl linkage to H2 in the
dienophile. (Duchene) The cyclohexene product will be prepared
through the cycloaddition of these two compounds.

Figure 9. Examples of DEX-DEX molecules synthesized to date.

Figure 10. DEX-MTX retrosynthesis.

Figure 11. Maps of the plasmids encoding the LexA-GR and B42-GR
fusion proteins.

Figure 12. Dex-cephem-Mtx retrosynthesis.



Figure 13. Dex-Mtx protein dimerization system. A
cell-permeable Dex-Mtx molecule is used to induce dimerization of
LexA-GR and DHFR-B42 protein chimeras, activating transcription
of a lacZ reporter gene.

Figure 14. Cell-based assays. Yeast cells containing LexA-GR
and B42-DHFR fusion proteins and the lacZ reporter gene are
grown on X-gal plates with or without Dex-Mtx. Dex-Mtx
dimerizes the fusion proteins, activating lacZ transcription,
hydrolyzing the chromogenic substrate X-gal, and turning the
cells blue. Dex-Mtx is added directly to the media in the X-gal
plate. The assay takes two to five days.

Figure 15. X-gal plate assay of Dex-cephem-Mtx induced lacZ
transcription. Yeast strains containing different LexA and B42
chimeras, plus a lacZ reporter gene, were grown on X-gal
indicator plates with or without Dex-cephem-MTX compounds: A,
1 µM Dex-MTX; B, 10 µM Dex-cephem-MTX; C, no small molecule.
The strains that are dark (blue in original) even in the absence
of small molecule (plate C) are positive controls for
protein-protein interaction. The dark strains on plates A and B
express LexA-DHFR and B42-GR fusion proteins, and the white
strains are negative controls, expressing only LexA and B42.

Figure 16A. Plate BTC4 grown on 4 different plates after 72
hours. One plate has no small molecule, so just the positive
controls should be dark. The other three plates contain
10 µM DM1, 10 µM D8M, or 10 µM D10M. Figure 16B is the plate
map for plate BTC4.

Figure 17A. Plate BTC6 grown on 4 plates after 56 hours. The two
top plates contain no small molecule, and the bottom two plates
contain 10 µM D10M. Figure 17B shows plate BTC6 grown on 2
plates after 60 hours. Both plates contain 1 µM D8M. Figure 17C
shows the plate map for plate BTC6.

Figure 18. The β-galactosidase activity of strain V494Y using
varying concentrations of D8M.

Figure 19. A screen for glycosidase activity. Dex-Mtx CIDs with
cleavable oligosaccharide linkers are used to assay the >3000
proteins in S. cerevisiae of unknown function for glycosidase
activity. A yeast cDNA library is introduced into the selection
strain. Only cells expressing active glycosidases cleave the
oligosaccharide linker, disrupt ura3 transcription, and survive
in the presence of 5-FOA.

Figure 20. Proposed solid-phase synthesis of the Dex-Mtx
glycosidase substrates. While the synthesis of Dex-(GlcNAc)A-Mtx
is shown, the synthesis is designed to allow the
introduction of a variety of sugar monomers with both regio- and
stereo-control.



Figure 21. The synthesis route of Mtx-Cephem. 



WO 02/059272 



PCT/US02/02.199 




Detailed Description of the Invention 

An embodiment of the invention is a compound having the formula:

H1-Y-H2

wherein H1 is a substrate capable of selectively binding
to a first receptor;

wherein H2 is a substrate capable of selectively binding
to and selectively forming a covalent bond with a second
receptor; and

wherein Y is a moiety providing a covalent linkage between
H1 and H2, which may be present or absent, and when absent, H1
is covalently linked to H2.

H1 may be a methotrexate moiety or an analog thereof.

H2 may be a cephem moiety capable of selectively binding to and
selectively forming a covalent bond with the
penicillin-binding-protein ("PBP"). H2 may alternatively be a
fluorouracil moiety capable of selectively binding to and
selectively forming a covalent bond with the thymidylate
synthase ("TS") enzyme.

The compound may have the structure: 




Another embodiment of the invention is a complex between the 
compound and a fusion protein, the fusion protein comprising a 
receptor domain which binds to the compound. 

The fusion protein may further comprise a DNA-binding domain
fused to the receptor domain, or a transcription activation
domain fused to the receptor domain.

In the complex, the receptor domain may be dihydrofolate
reductase ("DHFR"), penicillin-binding-protein ("PBP"), or
thymidylate synthase ("TS") enzyme. The PBP may be the
Streptomyces R61 PBP. The DHFR may be the E. coli DHFR
("eDHFR").

The fusion protein in the complex may be eDHFR-LexA, R61-LexA,
eDHFR-B42 or R61-B42.

Also described is a cell comprising the complex.

The cell may comprise a DNA sequence which on transcription
gives rise to a first fusion protein exogenous to the cell and
a second fusion protein exogenous to the cell,

wherein the first fusion protein is a receptor domain fused
with a DNA-binding domain; and

wherein the second fusion protein is a transcription
activation domain fused to either a penicillin-binding-protein
("PBP") or to a thymidylate synthase ("TS") enzyme.

In the cell, the receptor domain of the first fusion protein may
be DHFR; the DNA-binding domain of the first fusion protein may
be LexA; the transcription activation domain of the second
fusion protein may be B42. In the cell, the PBP may be the
Streptomyces R61 PBP.

The first fusion protein in the cell may be eDHFR-LexA, and the
second fusion protein may be R61-B42.

The cell may be a yeast cell, a bacterial cell or a mammalian
cell. In an embodiment, the cell is S. cerevisiae or E. coli.

Also disclosed is a method of dimerizing two fusion proteins
inside a cell using the compound, the method comprising the
steps of

a) providing a cell that expresses a first fusion protein
which comprises a receptor domain that binds to H1, and a second
fusion protein which comprises a receptor domain that binds to
and forms a covalent bond with H2, and

b) contacting the compound with the cell so as to dimerize
the two fusion proteins.

In the method, the receptor domain of the first fusion protein
may be DHFR; the DNA-binding domain of the first fusion protein
may be LexA; the transcription activation domain of the second
fusion protein may be B42; the receptor domain of the second
fusion protein may be a penicillin-binding-protein ("PBP") or
a thymidylate synthase ("TS") enzyme. The PBP may be the
Streptomyces R61 PBP.

In an embodiment of the method, the first fusion protein is 
eDHFR-LexA, and the second fusion protein is R61-B42. 

Also disclosed is a method for identifying a molecule that binds 
a known target in a cell from a pool of candidate molecules, 
comprising: 

(a) forming a screening molecule by covalently bonding each
molecule in the pool of candidate molecules to a substrate
capable of selectively binding to and selectively forming a
covalent bond with a receptor;

(b) introducing the screening molecule into a cell culture
comprising cells that express

a first fusion protein of a DNA-binding domain fused
to a known target receptor domain against which the
candidate molecule is screened,

a second fusion protein which comprises a receptor
domain capable of binding to and forming a covalent bond
with the screening molecule, and

a reporter gene wherein expression of the reporter
gene is conditioned on the proximity of the first fusion
protein to the second fusion protein;

(c) permitting the screening molecule to bind to the first
fusion protein and to the second fusion protein, bringing the two
fusion proteins into proximity so as to activate the expression
of the reporter gene;

(d) selecting the cell that expresses the reporter gene;
and

(e) identifying the small molecule that binds the known
target receptor.
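
The logic of steps (a) through (e) can be summarized in a toy model.
The following sketch is purely illustrative: the candidate names, the
affinity values, and the 100 nM reporter threshold are hypothetical,
and the reporter is modeled as a simple AND of the two binding events
rather than as a real transcriptional response.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMolecule:
    candidate: str            # hypothetical name of a pool member
    target_kd_nM: float       # assumed affinity for the known target
    covalent_handle_ok: bool  # True if the handle reacts with its receptor


def reporter_on(mol: ScreeningMolecule, threshold_nM: float = 100.0) -> bool:
    """Step (c): the reporter fires only if both fusion proteins are bridged."""
    binds_target = mol.target_kd_nM <= threshold_nM
    return binds_target and mol.covalent_handle_ok


def select_hits(pool):
    """Steps (d)-(e): keep candidates whose cells express the reporter."""
    return [m.candidate for m in pool if reporter_on(m)]


# Step (a)-(b): a hypothetical pool of screening molecules.
pool = [
    ScreeningMolecule("candidate_A", 5.0, True),     # binds target, handle reacts
    ScreeningMolecule("candidate_B", 5000.0, True),  # too weak for the target
    ScreeningMolecule("candidate_C", 5.0, False),    # handle fails to react
]
hits = select_hits(pool)
print(hits)
```

Only the candidate that satisfies both conditions is retained, which
mirrors the requirement that the reporter gene be conditioned on the
proximity of the two fusion proteins.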

The cell may be selected from the group consisting of insect
cells, yeast cells, mammalian cells, and their lysates.

The parts of the fusion proteins are as described previously. 

In the method, the molecule may be obtained from a combinatorial
library.

The method may have steps (b)-(e) iteratively repeated in the
presence of a preparation of random small molecules for
competitive binding with the screening molecule so as to
identify a molecule capable of competitively binding the known
target receptor.

Also disclosed is a method for identifying an unknown target
receptor to which a molecule is capable of binding in a cell,
comprising:

(a) providing a screening molecule having a ligand which
has a specificity for the unknown target receptor covalently
bonded to a substrate capable of selectively binding to and
selectively forming a covalent bond with a receptor;

(b) introducing the screening molecule into a cell which
expresses

a first fusion protein of a DNA-binding domain fused
to the unknown target receptor domain against which the
candidate molecule is screened,

a second fusion protein which comprises a receptor
domain capable of binding to and forming a covalent bond
with the screening molecule, and

a reporter gene wherein expression of the reporter
gene is conditioned on the proximity of the first fusion
protein to the second fusion protein;

(c) permitting the screening molecule to bind to the first
fusion protein and to the second fusion protein so as to
activate the expression of the reporter gene;

(d) selecting which cell expresses the unknown target
receptor; and

(e) identifying the unknown target receptor.

In the method, the unknown protein target is encoded by a DNA
from the group consisting of genomic DNA, cDNA and synthetic DNA.

In the method, the ligand may have a known biological function.

Also disclosed are compounds having the formula:

H1-Y-H2

wherein H1 is Mtx or an analog thereof;

wherein H2 is a substrate capable of binding to a receptor;
and

wherein Y is a moiety providing a covalent linkage between
H1 and H2, which may be present or absent, and when absent, H1
is covalently linked to H2.

The specific structures of the compounds are as shown below:

Also disclosed is a complex between the above compounds and a
fusion protein which comprises a binding domain capable of
binding to methotrexate, wherein H1 of the compound binds to the
binding domain of the fusion protein.

In this complex, the binding domain is that of the DHFR
receptor, or the fusion protein is DHFR-LexA or DHFR-B42.

Also disclosed is a cell comprising this complex, and a method
for screening a cDNA library by identifying the expressed
protein target, comprising:

(a) providing a screening molecule comprising a
methotrexate moiety or an analog of methotrexate covalently
bonded to a ligand which has a known specificity;

(b) introducing the screening molecule into a cell which
expresses a first fusion protein comprising a binding domain
capable of binding methotrexate, a second fusion protein
comprising the expressed unknown protein target, and a reporter
gene wherein expression of the reporter gene is conditioned on
the proximity of the first fusion protein to the second fusion
protein;

(c) permitting the screening molecule to bind to the first
fusion protein and to the second fusion protein so as to
activate the expression of the reporter gene;

(d) selecting which cell expresses the reporter gene; and

(e) identifying the unknown protein target and the
corresponding cDNA.

In this method, the unknown protein target may be encoded by a
DNA from the group consisting of genomic DNA, cDNA and
synthetic DNA. Other elements are as described previously.

Also disclosed is a new protein cloned by the method. 



In any embodiment of the invention, the cell may be selected
from the group consisting of insect cells, yeast cells,
mammalian cells, and their lysates. The first or the second
fusion protein may comprise a transcription module selected from
the group consisting of a DNA binding protein and a
transcriptional activator. Also, the molecule may be obtained
from a combinatorial library.

Any of the described methods can be adapted to determine the
cellular function of a natural protein. The methods can also
be adapted to identify the cellular targets of a drug, this
method further comprising screening with the drug in question
being part of the CID.

The described methods may also be adapted to identify new
protein targets for pharmaceuticals.

The described methods may also be adapted for determining the
function of a protein, this method further including screening
with a natural cofactor being part of the CID.

The described methods may also be adapted for determining the
function of a protein, this method further including screening
with a natural substrate being part of the CID.

The described methods may also be adapted for screening a
compound for the ability to inhibit a ligand-receptor
interaction.

In any of the described embodiments, each of H1 and H2 is
capable of binding to a receptor with an IC50 of less than 100 nM.
In a preferred embodiment, each of H1 and H2 is capable of
binding to a receptor with an IC50 of less than 10 nM. In the
most preferred embodiment, each of H1 and H2 is capable of
binding to a receptor with an IC50 of less than 1 nM.
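
The three affinity thresholds can be expressed as a small lookup. This
is a minimal sketch: the function names and tier labels are
hypothetical conveniences, and only the 100 nM, 10 nM, and 1 nM
cut-offs come from the text. Because each of H1 and H2 must meet the
threshold, the weaker handle determines the tier of the pair.

```python
def affinity_tier(ic50_nM: float) -> str:
    """Map a handle's IC50 (in nM) to the strictest tier it satisfies."""
    if ic50_nM <= 0:
        raise ValueError("IC50 must be positive")
    if ic50_nM < 1:
        return "most preferred"
    if ic50_nM < 10:
        return "preferred"
    if ic50_nM < 100:
        return "described"
    return "outside the described range"


def pair_tier(h1_ic50_nM: float, h2_ic50_nM: float) -> str:
    """Both handles must meet a threshold, so the weaker one decides."""
    return affinity_tier(max(h1_ic50_nM, h2_ic50_nM))


# Hypothetical IC50 values, for illustration only.
print(pair_tier(0.5, 5.0))
print(pair_tier(0.5, 250.0))
```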



Each of H1 or H2 may be derived from a compound selected from
the group consisting of steroids, hormones, nuclear receptor
ligands, cofactors, antibiotics, sugars, enzyme inhibitors, and
drugs.

Each of H1 and H2 may also represent a compound selected from
the group consisting of dexamethasone, 3,5,3'-triiodothyronine,
trans-retinoic acid, biotin, coumermycin, tetracycline, lactose,
methotrexate, FK506, and FK506 analogs.

In any of the described methods, the cellular readout may be
gene transcription, such that an increase in gene transcription
indicates catalysis of bond formation by the protein to be
screened.

In the described methods, the screening is performed by
Fluorescence Activated Cell Sorting (FACS), or by gene
transcription markers selected from the group consisting of
Green Fluorescent Protein, LacZ β-galactosidase, luciferase,
antibiotic resistance β-lactamases, and yeast markers.

As discussed, the foregoing may be adapted to the determination
of the binding specificity of biomolecules. Determining this
binding specificity is important not only
for understanding the mechanisms and pathways of biological
systems, but also because this binding specificity provides
information for the future development of therapeutic and
diagnostic agents. This invention describes a cell-based assay
for detecting binding activities of steroids to better
understand their in vivo molecular recognition. Steroid
hormones are essential for the regulation of salts and water in
the body, for metabolism, and for the maturation and sexual
development of males and females. Moreover, the development of
several kinds of cancer has been linked directly to steroids as
causative agents. Due to the necessary role that steroids play
in bodily functions, it is important to learn about their
interactions with cellular targets to understand how they
demonstrate this dual behavior. The screen builds from existing
technology for dimerizing proteins within cells, using chemical
inducers of dimerization ("CID" or "CIDs"). By using a steroid
as one of the ligands of the dimeric small-molecule CIDs,
binding can be detected.

The foregoing embodiments of the subject invention may be 
accomplished according to the guidance which follows. Certain 
of the foregoing embodiments are exemplified. Sufficient 
guidance is provided for a skilled artisan to arrive at all of 
the embodiments of the subject invention.

Selection Strategy

The selection strategy is based on existing methods for
controlling protein dimerization in vivo using small molecules
(Fig. 1). Several "chemical inducers of dimerization" have been
reported showing that protein dimerization can be bridged by
small molecules. (Spencer; Crabtree) Moreover, a number of
techniques exist for translating the dimerization of two
proteins to an in vivo screen or selection. (Hu 1990; Hu 1995;
Fields; Gyuris; Johnsson; Rossi; Karimova) Taken together, this
work establishes that it is feasible to use a small molecule
H1-H2 to dimerize two fusion proteins, reporter V-H1 receptor and
reporter W-H2 receptor, generating a cellular read-out.

Disclosed is a general method for screening a cDNA library based
on the ability of members of that library to express a protein
capable of binding to H1 or H2 or an ability of that protein to
catalyze a reaction to either form or cleave the covalent
coupling between H1 and H2. That is, the small molecule
H1-X-BOND-Y-H2 represented in Fig. 1 is used to mediate protein
dimerization and hence a cellular signal. Then the polypeptide
or enzyme that binds to either H1 or H2 is selected. The selection
is tied to the cellular "read-out" because only cells containing
the polypeptide which binds will have the desired phenotype.

The strategy is both general and a direct selection for 
polypeptides which bind. The selection can be applied to a 
broad range of polypeptides because protein dimerization depends 
only on the H1 and H2 selected. It is a direct selection for
the polypeptides because binding of H1 and H2 is necessary for
protein dimerization. Also, this strategy does not limit the 
starting protein scaffold. 

Preparation and design of handles "H1" and "H2"

Ideally, a chemical handle should bind its receptor with high
affinity (≤ 100 nM), cross cell membranes yet be inert to
modification or degradation, be available in reasonable
quantities, and present a convenient side-chain for routine
chemical derivatization that does not disrupt receptor binding.
Again, we build from DEX-FK506 (H1-H2) mediated dimerization of
LexA-rGR and B42-FKBP12 (Fig. 2) (Licitra).

Dexamethasone (DEX) is a very attractive chemical handle H1
(Fig. 5A). DEX binds rat glucocorticoid receptor (GR) with a
K_D of 5 nM, (Chakraborti) can regulate the in vivo activity and
nuclear localization of GR fusion proteins (Picard), and is
commercially available. Affinity columns for rGR have been
prepared via the C20 α-hydroxy ketone of dexamethasone.
(Govindan; Manz)

The antibacterial and anticancer drug methotrexate (MTX) is used
in place of FK506 as the chemical handle H2 (Fig. 5B, 5C).
FK506 is not available in large quantities, coupling via the C21
allyl group requires several chemical transformations including
silyl protection of FK506, (Spencer; Pruschy) and FK506 is both
acid- and base-sensitive. (Wagner; Coleman) MTX, on the other
hand, is commercially available and can be modified selectively
at its γ-carboxylate without disrupting dihydrofolate reductase
(DHFR) binding. (Kralovec; Bolin) Even though MTX inhibits DHFR
with pM affinity, (Bolin; Sasso) both E. coli and S. cerevisiae
grow in the presence of MTX when supplemented with appropriate
nutrients. (Huang)

The ability of DEX-MTX to mediate the dimerization of LexA-rGR
and B42-DHFR is tested by (1) synthesis of a series of DEX-MTX
molecules with simple diamine linkers (Fig. 6); and (2) showing
that DEX-MTX can dimerize LexA-rGR and B42-DHFR based on lacZ
transcription and that both DEX and MTX, uncoupled, can
competitively disrupt this dimerization. Cell-permeable
chemical handles that can be prepared readily and that are
efficient at inducing protein dimerization not only are
essential to the robustness of this selection methodology but
also should find broad use as chemical inducers of protein
dimerization.
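
The competition test in item (2) can be illustrated with the standard
competitive-binding expression, in which a competitor raises the
apparent K_D of the dimerizer by the factor (1 + [I]/K_I). The sketch
below is an assumption-laden illustration, not the experimental
procedure: it treats one receptor site at equilibrium with free
concentrations equal to total concentrations, and it assumes that the
DEX portion of DEX-MTX binds rGR with the same 5 nM K_D as free DEX.
The concentrations are hypothetical.

```python
def occupancy_with_competitor(
    dimerizer_nM: float,
    kd_dimerizer_nM: float,
    competitor_nM: float,
    kd_competitor_nM: float,
) -> float:
    """Fraction of receptor bound by the dimerizer when a competitor is present."""
    # A competitive ligand scales the apparent K_D by (1 + [I] / K_I).
    apparent_kd = kd_dimerizer_nM * (1.0 + competitor_nM / kd_competitor_nM)
    return dimerizer_nM / (dimerizer_nM + apparent_kd)


KD_DEX_NM = 5.0  # K_D of DEX for rGR quoted in the text; reused for DEX-MTX (assumption)

# 1 uM dimerizer alone, then with a 10-fold excess of uncoupled DEX (hypothetical doses).
without_dex = occupancy_with_competitor(1000.0, KD_DEX_NM, 0.0, KD_DEX_NM)
with_dex = occupancy_with_competitor(1000.0, KD_DEX_NM, 10000.0, KD_DEX_NM)

print(without_dex)  # close to full occupancy
print(with_dex)     # occupancy drops sharply
```

Under these assumptions an excess of uncoupled handle displaces most
of the dimerizer from its receptor, which is the behavior the lacZ
competition experiment is designed to detect.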

Dexamethasone (DEX) and the glucocorticoid receptor (GR) present
a particularly attractive chemical handle/receptor pair.
Dexamethasone is the cortical steroid with the highest affinity
for the rat glucocorticoid receptor. The rGR binds DEX with a
K_D of 5 nM, and mutants of rGR have been isolated with up to
10-fold higher affinity for DEX. (Chakraborti) The steroid
dexamethasone has been used extensively as a cell-permeable
small molecule to regulate the in vivo activity and nuclear
localization of GR fusion proteins. (Picard) This work firmly
establishes that DEX is cell permeable and is not modified or
broken down in the cell. Recently, there has been one report
of a yeast "three-hybrid" system in which a GR-DNA-binding
protein fusion and a FKBP12-transcription activation domain
fusion could be dimerized by the small molecule DEX-FK506 (Fig.
2). Dexamethasone is commercially available in large
quantities. Affinity columns for rGR have been prepared via
oxidation of the C20 α-hydroxy ketone of DEX to the
corresponding carboxylic acid. (Govindan, Manz)

Methotrexate (MTX) inhibition of dihydrofolate reductase (DHFR)
is one of the textbook examples of high-affinity ligand binding.
The interaction between MTX and DHFR is extremely well
characterized both biochemically and structurally. DHFR is a
monomeric protein and binds MTX with picomolar affinity. (Bolin,
Sasso) Even though MTX inhibits DHFR with such high affinity,
both E. coli and S. cerevisiae grow in the presence of MTX when
supplemented with appropriate nutrients. (Huang) The ability
of MTX to serve both as an antibacterial and an anticancer agent
is clear evidence that MTX has excellent pharmacokinetic
properties. MTX is known to be imported into cells via a
specific folate transporter protein. MTX is commercially
available and can be synthesized readily from simple precursors.
MTX can be modified selectively at its γ-carboxylate without
disrupting its interaction with DHFR. (Kralovec, Bolin) There
are several examples reported where MTX has been modified via
its γ-carboxylate to prepare affinity columns and antibody
conjugates.

Given the number of cellular pathways that depend on cascades
of dynamic protein-protein interactions, the ability to regulate
protein oligomerization in vivo with small molecules should have
broad applications in medicine and basic science. The key to
realizing the potential of these small molecules both for the
catalysis screen in the laboratory and for these biomedical
applications is developing H1-H2 molecules that can be prepared
readily and are efficient at inducing protein dimerization in
vivo.

Other handles H1 and H2 may be, for example, steroids, such as
the Dexamethasone used herein; enzyme inhibitors, such as
Methotrexate used herein; drugs, such as FK506; hormones, such
as the thyroid hormone 3,5,3'-triiodothyronine (structure below)

[Structure of 3,5,3'-triiodothyronine not reproduced]

Ligands for nuclear receptors, such as retinoic acids, for
example the structure below

[Structure of a retinoic acid not reproduced]

General cofactors, such as Biotin (structure below)

[Structure of biotin not reproduced]

and antibiotics, such as Coumermycin (which can be used to
induce protein dimerization according to Perlmutter et al.,
Nature 383, 178 (1996)).

Derivatives of the mentioned compounds with groups suitable for
linking without interfering with receptor binding can also be
used.

It has been found that the combination of a CID containing an
Mtx moiety with a fusion protein containing a DHFR binding
domain is highly useful and widely applicable. Mtx and the
DHFR receptor present a particularly attractive chemical
handle/receptor pair. In addition to having a picomolar binding
affinity, the complex of an Mtx moiety and the DHFR binding
domain is extremely well characterized. The excellent
pharmacokinetic properties of Mtx make it an ideal moiety to be
used in procedures where ease of importation into cells is
required.



Linking H1 and H2 through a linker

To illustrate how the handles H1 and H2 may be linked together,
several of the DEX-DEX compounds that have been synthesized to
date are shown in Figure 9. The linkers are all commercially
available or can be prepared in a single step, and they vary
in hydrophobicity, length, and flexibility. The DEX-DEX molecules
shown in Figure 9 were prepared from Dexamethasone and the
corresponding diamines. The C20 α-hydroxy ketone of
dexamethasone was oxidized using sodium periodate to the
corresponding carboxylic acid in quantitative yield as
described. The diamines are commercially available. The
diamine corresponding to DEX-DEX 2 was prepared from α,α'-
dibromo-m-xylene and aminoethanethiol and used crude. The
diamines were coupled to the carboxylic acid derivative of
dexamethasone using the peptide-coupling reagent PyBOP under
standard conditions in 60-80% yield.

We have synthesized a DEX-MTX molecule. The retrosynthesis is
shown in Figure 10. The synthesis is designed to be modular so
that we can easily bring in a variety of linkers in one of the
final steps as the dibromo- or diiodo-derivatives. For
synthetic ease, the glutamate residue has been replaced with
homocysteine. This replacement should be neutral because there
is both biochemical and structural evidence that the γ-
carboxylate of methotrexate can be modified without disrupting
DHFR binding. The final compound has been synthesized in 12
steps in 1.3% overall yield. Also synthesized are analogous
compounds where the α,α'-dibromo-m-xylene linker is replaced
with 1,5-diiodopentane or 1,10-diiododecane. A similar route
is used to prepare MTX-MTX molecules.
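
As a rough illustration of what a 1.3% overall yield over 12 steps implies, the following sketch computes the geometric-mean yield per step. It assumes, purely for illustration, a linear sequence in which every step contributes equally; the individual step yields are not given here, and the function name is hypothetical.

```python
def average_step_yield(overall_yield: float, n_steps: int) -> float:
    """Geometric-mean per-step yield for a linear synthesis.

    Assumes every step has the same yield, which is an idealization.
    """
    if not 0.0 < overall_yield <= 1.0:
        raise ValueError("overall_yield must be a fraction in (0, 1]")
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    return overall_yield ** (1.0 / n_steps)


# 12 steps and 1.3% overall yield are the values quoted in the text.
per_step = average_step_yield(0.013, 12)
print(f"average yield per step: {per_step:.1%}")
```

Under this equal-yield assumption the average comes out at roughly 70% per step, which shows how quickly moderate losses compound over a long route.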


Design of the protein chimeras 

The second important feature is the design of the protein
chimeras. The yeast two-hybrid assay was chosen in the examples
because of its flexibility. Specifically, the Brent two-hybrid
system is used, which uses LexA as the DNA-binding domain and
B42 as the transcription activation domain. The Brent system
is one of the two most commonly used yeast two-hybrid systems.
An advantage of the Brent system is that it does not rely on
Gal4, allowing use of the regulatable Gal promoter. lacZ under
control of 4 tandem LexA operators is used as the reporter
gene. Initially, we chose to make simple LexA-GR, LexA-DHFR,
B42-GR, and B42-DHFR fusion proteins that do not depart from the
design of the Brent system. In the Brent system, the full
length LexA protein, which includes both the N-terminal DNA-
binding domain and the C-terminal dimerization domain, is used.
The B42 domain is a monomer. The C-terminal hormone-binding
domain of the rat Glucocorticoid Receptor is chosen because this
domain was shown to work previously in the yeast three-hybrid
system reported by Licitra, et al. Both the E. coli and the
murine DHFRs are used because these are two of the most well
characterized DHFRs. The E. coli protein has the advantage that
methotrexate binding is independent of NADPH binding.

Construction of the LexA- and B42-receptor fusions is
facilitated by the availability of commercial vectors for the
Brent two-hybrid system. These vectors are shuttle vectors that
can be manipulated both in bacteria and yeast. The LexA chimera
is under control of the strong, constitutive alcohol
dehydrogenase promoter. The B42 chimera is under control of the
strong, regulatable galactose promoter. Both the GR and the two
DHFR genes were introduced into the multiple cloning sites of
the commercial LexA and B42 expression vectors using standard
molecular biology techniques. The GR fusions are shown in
Figure 11. The available restriction sites result in a three
amino acid spacer between the two proteins in both the GR and
the DHFR constructs. The plasmids encoding the LexA- and B42-
fusion proteins were introduced in all necessary combinations
into S. cerevisiae strain FY250 containing the lacZ reporter
plasmid.

Three initial assays are conducted: (1) toxicity of the ligand
and receptor, (2) cell permeability of the H1-H2 molecules as
judged by competition in the yeast three-hybrid system, and (3)
activation of lacZ transcription by the H1-H2 molecule as judged
by X-gal hydrolysis. All of these experiments have been done
as plate assays. The toxicity of the ligand and receptor is
judged simply by seeing if either induction of the receptor
fusions or application of the ligand to the plate impairs cell
growth. Cell permeability is assessed based on the ability of
an excess of DEX-DEX or DEX-MTX to disrupt DEX-FK506 induction
of lacZ transcription in the yeast three-hybrid system. An
excess of DEX-DEX or DEX-MTX should bind to all of the available
LexA-GR chimera and disrupt transcription activation so long as
the molecule is cell permeable and retains the ability to bind
to GR. Effective protein dimerization by H1-H2 is assayed by
activation of lacZ transcription.

The DEX-DEX molecules are tested by all three assays.
Preliminary results show that neither DEX nor GR is toxic.
Under the conditions tried thus far, none of the DEX-DEX
molecules tested are efficient at protein dimerization as judged
by the lacZ transcription assay. We have been able to repeat
the yeast three-hybrid result (activation of lacZ transcription
using DEX-FK506) in our lab. DEX-DEX 1 and DEX-DEX 5 have been
assayed for cell permeability. At 1 µM DEX-FK506 and 10 µM DEX-
DEX, DEX-DEX 1, but not DEX-DEX 5, decreases lacZ transcription
in the yeast three-hybrid system by 50%. These results show
that a DEX-DEX molecule is cell permeable and retains the
ability to bind to GR.

The protein chimeras are varied in four ways: (1) invert the
orientation of the B42 activation domain and the receptor; (2)
introduce tandem repeats of the receptor; (3) introduce
(GlyGlySer)n linkers between the protein domains; (4) vary the
DNA-binding domain and the transcription activation domain. We
expect these experiments to be carried out over the next two
years. The motivation for these experiments is that many
different protein fusions have been reported in the literature
and these types of modifications have been shown to be critical
in these previous experiments. We have designed each of these
experiments so that multiple variations can be made
simultaneously. Inverting the orientation so that the receptor,
not B42, is N-terminal is trivial. We will construct a generic
vector that can be used with different receptors. Likewise,
since several different DNA-binding domains and activation
domains have been used with the yeast two-hybrid system, it is
not difficult to vary these domains.


An approach to introducing tandem repeats of the receptor and
(GlyGlySer)n linkers that allows us to make multiple constructs
simultaneously is provided. As illustrated for GR, the approach
to making tandem repeats of the receptor is to use restriction
enzymes with compatible cohesive ends (Fig. 14). The same PCR
product can then be used to introduce each receptor unit. By
including a BamHI restriction site immediately 5' to the gene
encoding GR, a series of (GlyGlySer)n linkers can be introduced
essentially as described. This approach relies on the fact that
the BamHI site, GGA-TCC, encodes Gly-Ser. This combined
approach will allow for the construction of multiple protein
chimeras simultaneously. Since a lacZ screen is used, all of
these constructs can be assayed simultaneously.
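
The reading-frame argument above (an in-frame BamHI site, GGA-TCC, reads as Gly-Ser) can be checked with a short sketch. The codon table is deliberately minimal, the GGT codon chosen for the additional glycine is an illustrative assumption, and the helper names are hypothetical. The sketch only demonstrates the translation; it does not model the actual cloning steps.

```python
from typing import List

# Minimal codon table covering only the codons used in this sketch.
CODONS = {"GGA": "Gly", "GGT": "Gly", "TCC": "Ser"}

BAMHI_SITE = "GGATCC"  # recognition site named in the text


def translate(dna: str) -> List[str]:
    """Translate an in-frame DNA string using the minimal table above."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def gly_gly_ser_coding(n: int) -> str:
    """One possible coding sequence for (GlyGlySer)n.

    Each repeat is written as GGT followed by GGATCC; the GGT codon is an
    illustrative choice, not taken from the source.
    """
    return ("GGT" + BAMHI_SITE) * n


print(translate(BAMHI_SITE))
linker = gly_gly_ser_coding(2)
print(linker, translate(linker))
```

The first call confirms that GGA-TCC reads as Gly-Ser in frame; the second shows one way a (GlyGlySer)2 linker could be encoded around that site.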

Design of reporter genes

A reporter gene assay measures the activity of a gene's
promoter. It takes advantage of molecular biology techniques,
which allow one to put heterologous genes under the control of
a mammalian cell (Gorman, C.M. et al., Mol. Cell Biol. 2: 1044-
1051 (1982); Alam, J. and Cook, J.L., Anal. Biochem. 188: 245-
254 (1990)). Activation of the promoter induces the reporter
gene as well as or instead of the endogenous gene. By design
the reporter gene codes for a protein that can easily be
detected and measured. Commonly it is an enzyme that converts
a commercially available substrate into a product. This
conversion is conveniently followed by either chromatography or
direct optical measurement and allows for the quantification of
the amount of enzyme produced.

Reporter genes are commercially available on a variety of
plasmids for the study of gene regulation in a large variety of
organisms (Alam and Cook, supra). Promoters of interest can be
inserted into multiple cloning sites provided for this purpose
in front of the reporter gene on the plasmid (Rosenthal, N.,
Methods Enzymol. 152: 704-720 (1987); Shiau, A. and Smith, J.M.,
Gene 67: 295-299 (1988)). Standard techniques are used to
introduce these genes into a cell type or whole organism (e.g.,
as described in Sambrook, J., Fritsch, E.F. and Maniatis, T.
Expression of cloned genes in cultured mammalian cells. In:
Molecular Cloning, edited by Nolan, C. New York: Cold Spring
Harbor Laboratory Press, 1989). Resistance markers provided on
the plasmid can then be used to select for successfully
transfected cells.

Ease of use and the large signal amplification make this
technique increasingly popular in the study of gene regulation.
Every step in the cascade DNA -> RNA -> Enzyme -> Product ->
Signal amplifies the next one in the sequence. The further down
in the cascade one measures, the more signal one obtains.

In an ideal reporter gene assay, the reporter gene under the
control of the promoter of interest is transfected into cells,
either transiently or stably. Receptor activation leads to a
change in enzyme levels via transcriptional and translational
events. The amount of enzyme present can be measured via its
enzymatic action on a substrate.

Host Cell 

The host cell for the foregoing screen may be any cell capable
of expressing the protein or cDNA library of proteins to be
screened. Some suitable host cells have been found to be yeast
cells, Saccharomyces cerevisiae, and E. coli.

This invention will be better understood from the Experimental
Details which follow. However, one skilled in the art will
readily appreciate that the specific methods and results
discussed are merely illustrative of the invention as described
more fully in the claims which follow thereafter.

EXPERIMENTAL DETAILS 
Example 1

We have shown that Dex-Mtx can dimerize a LexA-DHFR and a B42-
rGR protein chimera in vivo (Table I). (Lin, 1999) Dex-Mtx was
assayed using both plate and liquid assays at extracellular
concentrations of 1-100 µM. No activation was observed at
concentrations < 0.1 µM. 100 µM is the limit of Dex-Mtx
solubility. Control experiments established that lacZ
transcription is dependent on Dex-Mtx. There are only
background levels of lacZ transcription when Dex-Mtx is omitted,
LexA-DHFR is replaced with LexA, or B42-GR is replaced with B42.
Likewise, a 10-fold excess of Mtx competes out Dex-Mtx-dependent
lacZ transcription. Interestingly, of the 10 protein chimera
combinations tested, Dex-Mtx could only activate lacZ
transcription in the context of the LexA-eDHFR and the B42-
(Gly6)-rGR chimeras (Table I). None of the 9 other protein
combinations tested worked. This result is consistent with our
view that the Dex-Mtx system (and other dimerization systems)
could be further improved both by biochemical and structural
characterization and by variation of the protein chimeras and
the reporter.

Table I.

Effect of Dex-Mtx on Dimerization of Different LexA- and
B42-Protein Fusions

Strain(a)   LexA Chimera      B42 Chimera            Dex-Mtx(b)
1           LexA-eDHFR(c)     B42-Gly6(d)-rGR2(e)    Yes
2           LexA-eDHFR        B42-rGR2               No
3           LexA-eDHFR        B42-(rGR2)3            No
4           LexA-mDHFR(f)     B42-Gly6-rGR2          No
5           LexA-mDHFR        B42-rGR2               No
6           LexA-mDHFR        B42-(rGR2)3            No
7           LexA-rGR2         B42-eDHFR              No
8           LexA-rGR2         B42-mDHFR              No
9           LexA-(rGR2)3      B42-eDHFR              No
10          LexA-(rGR2)3      B42-mDHFR              No

(a) S. cerevisiae strain FY250 containing pMW106 (the lacZ reporter
plasmid), pMW103 (encoding the LexA chimera), and pMW012 (encoding the
B42 chimera). (b) Dex-Mtx-dependent dimerization was determined using
standard assays for lacZ transcription. See the text for details.
(c) The E. coli DHFR. (d) In some constructs a 6 Glycine linker was added
between B42 and the rGR. (e) A mutant form of the hormone-binding domain
of the glucocorticoid receptor (residues 524-795, Phe620-Ser, Cys636-Gly)
with increased affinity for Dex was used in these studies. (f) The
murine DHFR.



Example 2 

Cephalosporin Hydrolysis by the 908R Cephalosporinase in the
yeast three-hybrid system

The subject invention is exemplified using the components of the
yeast three-hybrid system (Licitra, represented in Fig. 2). In
this system DEX-FK506 (exemplifying H1-H2) mediates dimerization
of the protein fusions LexA-GR (representing reporter V-H1
receptor) and B42-FKBP12 (representing reporter W-H2 receptor),
thus activating transcription of a lacZ reporter gene. The
chemical handles H1 and H2 and the protein dimerization assay,
however, all can be varied.

In the subject invention, however, the yeast three-hybrid system
is altered by inserting a BOND, B, as well as any required
spacers X and Y, so as to form a small molecule having the
structure H1-X-B-Y-H2. While there is ample precedent for
small-molecule mediated protein dimerization, what remains is
to show these assays can be used to select for catalysts.
Cephalosporin hydrolysis by a cephalosporinase provides a simple
cleavage reaction to demonstrate the selection (Fig. 3). The
BOND, B, in this example is a cephem linkage susceptible to attack
by cephalosporinase, such that hydrolysis by the
cephalosporinase results in separation of the proteins and
deactivation of the transcription of lacZ.

The E. cloacae 908R cephalosporinase is well characterized both
biochemically (Galleni; Galleni; Galleni; Monnaie) and
structurally (Lobkovsky) and is simple to manipulate. Several
approaches have been developed for modifying cephalosporin
antibiotics at the C7' and C3' positions to improve their
pharmacokinetic properties and to prepare pro-drugs.
(Durckheimer; Albrecht; Vrudhula; Meyer)

Cephalosporin hydrolysis by the cephalosporinase can disrupt
protein dimerization and hence be used to discriminate between
cells containing active and inactive enzyme. Specifically, (1)
(C7)DEX-CEPHEM-(C3')FK506 is synthesized; (2) DEX-CEPHEM-FK506
is shown to dimerize LexA-GR and B42-FKBP12 and both DEX and
FK506 are shown to disrupt the dimerization; (3) induction of the
wild type cephalosporinase, but not an inactive Ser64 variant,
is shown to disrupt cephem-mediated protein dimerization; and
(4) cells containing active cephalosporinase are identified
based on loss of protein dimerization in a mock screen. A
screen for loss of lacZ transcription is sufficient for the
screen.

The retro-synthesis of DEX-CEPHEM-FK506 is shown in Figure 4;
it allows H1, H2, and the linker molecules to be varied. The
allylic chloride intermediate 2 has been synthesized from cephem
1 in 20% yield in four steps. Mild conditions for coupling H2-
SH to the allylic chloride 2 using sodium iodide have been
developed; DEX-SH can be coupled in 82% yield. 908R
cephalosporinase variants have been constructed both with and
without nuclear-localization sequences under control of GAL1 and
MET25 promoters. All of these variants are known to be active
in vivo based on assays using the chromogenic substrate
nitrocefin. (Pluckthun) Several S. cerevisiae strains suitable for
this model reaction have been constructed. DEX-FK506 is known to
dimerize LexA-rGR and B42-FKBP12 in these strain backgrounds
(yeast three-hybrid system).

All of the components needed for the proof of principle have
been prepared. Specifically, we have developed a modular
synthesis of Dex-cephem-Mtx and constructed a S. cerevisiae
strain suitable for the proof of principle. The retro-synthesis
of Dex-cephem-Dex is shown in Figure 12; it allows H1, H2, and
the linker molecules to be varied to optimize the cephem
substrate. We have synthesized the allylic chloride
intermediate 2 from cephem 1 in 20% yield in four steps. We
have developed mild conditions for coupling H2-SH to the allylic
chloride 2 using sodium iodide; Dex-SH can be coupled in 82%
yield. We have constructed strain
FY250/pMW106/pMW2rGR2/pMW3FKBP12 and shown that Dex-FK506 can
still mediate dimerization of LexA-rGR and B42-FKBP12 in this
strain. The strain provides an additional marker for the
enzyme, grows well on galactose and raffinose, and replaces all
of the ampR markers with kanR or specR markers. In addition, we
have constructed several constructs for the galactose- or
methionine-regulated overexpression of the cephalosporinase.
Based on hydrolysis of the chromogenic substrate nitrocefin
(Pluckthun, 1987), we have shown that the cephalosporinase is
active in the FY250 background.


The basis for catalysis by the cephalosporinase is studied using
combinatorial techniques. Understanding the mechanism is
important for anticipating future routes to antibiotic
resistance and for developing new cephalosporin antibiotics.

Dex-cephem-Mtx induces protein dimerization in vivo

Preparation of a Dex-cephem-Mtx (cleavable cephem linker)

The cephem substrates were designed such that introduction of
the Dex and Mtx ligands would not interfere with
cephalosporinase hydrolysis of the cephem core and so that a
variety of Dex-cephem-Mtx substrates could be synthesized
readily from commercially available materials. (The Chemistry
of the β-Lactams; Durckheimer; Albrecht; Meyer; Zlokarnik) We
synthesized four potential Dex-cephem-Mtx substrates from a
commercial amino-chloro-cephem intermediate. Dexamethasone
was coupled to the C7 amino group of the cephem core via
aminocarboxylic acids of different lengths, and methotrexate to
the C3' chloro group via aminothiols of different lengths. All
four compounds were prepared from three components in 3-4 steps
in 10-30% overall yield.

The critical issue was whether introduction of the cephem linker
would impede either the cell permeability or the dimerization
activity of the Dex-Mtx CID. We screened all four Dex-cephem-
Mtx compounds using the yeast two-hybrid lacZ transcription
assay and determined that all four compounds are cell permeable
and that two of these compounds are capable of inducing protein
dimerization in vivo, as shown in Figure 15. Based on these
results, it appears that the length of the linkers between the
cephem core and the Dex and Mtx ligands is important; the
cephem core must not be too close to the receptor or it will
prevent access to the receptor. These results support the
general feasibility of preparing CIDs with cleavable linkers and
using these compounds in vivo with the catalysis screen.


The ability of this Dex-cephem-Mtx CID to serve as a read-out
for catalysis is evaluated using the well-studied enzymatic
reaction, cephem hydrolysis by a cephalosporinase. Hydrolysis
of the lactam bond results in expulsion of the leaving group at
the C3' position, effectively breaking the bond between Dex and
Mtx.




Having identified Dex-cephem-Mtx substrates that are efficient
dimerizers in the yeast two-hybrid assay, the next step is to
demonstrate that the screen can discriminate between active and
inactive enzymes. The penicillin-binding protein (PBP) from
Streptomyces R61 provides a good control "inactive" enzyme to
compare to the active Q908R cephalosporinase. (Kelly; Ghuysen)
Cephalosporinases are believed to have evolved from
PBPs. (Ghuysen; Knox) Both enzymes have the same three-
dimensional fold and follow the same catalytic mechanism
involving an acyl-enzyme intermediate. (Kelly, Lobkovsky) PBPs
bind to cephems with high affinity, form the acyl-enzyme
intermediate rapidly, but hydrolyze the acyl-enzyme intermediate
much more slowly than do cephalosporinases. We have introduced
both the Q908R cephalosporinase and the R61 PBP into yeast
shuttle vectors that place the enzymes under control of either
a galactose-inducible or a methionine-repressible promoter.
Based on plate assays using the chromogenic substrate
nitrocefin (Pluckthun), the Q908R enzyme was expressed in an
active form in yeast with either promoter. This assay cannot
detect PBP activity.

The Dex-cephem-Mtx CID screen should distinguish between the
cephalosporinase and the PBP. Yeast strains containing the
cephalosporinase hydrolyze the cephem linkage rapidly,
disrupting lacZ transcription. The PBP, on the other hand,
hydrolyzes the cephem linkage too slowly to change the levels of
lacZ transcription significantly.

Can the CID screen detect catalytic activity? 

Strong support for the feasibility of using CIDs with cleavable
linkers to detect catalytic activity is provided by in vivo
selections for protease activity based on cleavage of internal
protease sites engineered in a variety of proteins, including
Gal4. With an active Dex-cephem-Mtx CID in hand, our next step
is to find conditions where the CID screen gives an enzyme-
dependent signal. We envision two scenarios which should result
in an enzyme-dependent signal: (1) overexpression of the enzyme
relative to the LexA- and B42-reporter proteins and (2)
expression of the enzyme prior to expression of the LexA- and
B42-reporter proteins. The Brent Y2H vectors currently employed
in the lab will have to be modified to allow for control over
the levels and timing of LexA and B42 expression. As supplied,
the Brent vectors have the LexA fusion protein under control of
the strong, constitutive alcohol dehydrogenase promoter (P_ADH)
and the B42 fusion protein under control of the strong
galactose-inducible promoter (P_GAL). Both vectors contain the
high-copy yeast 2µ origin of replication. We plan simply to
place the LexA fusion protein under control of a galactose-
inducible promoter, just like B42. The GAL promoter is the most
tightly regulated promoter available in yeast and is induced by
galactose and repressed by glucose. It can be fully repressed,
and it can direct expression of a range of intermediate protein
concentrations by varying the relative percentages of glucose
and galactose in the growth media. Thus, with both LexA and B42
under control of Gal promoters, these reporter proteins can be
turned off and then on or expressed at intermediate
concentrations in concert. If this approach does not work,
there are many other ways to tune the sensitivity of the system.
The expression of the enzyme, LexA, and B42 can all be
controlled using other inducible or constitutive promoters or
by integrating LexA and B42 into the chromosome. The lacZ
reporter gene can be replaced with other chromogenic reporters
or selectable markers. Alternatively, the sensitivity of the
system can be tuned by varying the substrate:product ratio by
adding both Dex-cephem-Mtx (substrate) and Dex and Mtx
("product") to the growth media.

Once conditions are found where we can detect enzyme-dependent
cleavage of the cephem linker, we will carry out a mock screen as
a proof-of-principle experiment. Specifically, plasmids
encoding the cephalosporinase and the PBP in a ratio of 1:99
will be introduced into a yeast strain carrying the appropriate
protein chimera and reporter genes. Cells harboring the
cephalosporinase should be white, while those containing the PBP
should be blue. Plasmids from these colonies will be isolated
and sequenced to confirm the identity of the expressed enzyme.
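
To give a feel for the scale of such a mock screen, the sketch below encodes the expected colony colors and estimates how many colonies would need to be plated to see at least one cephalosporinase clone at a 1:99 ratio. The 95% confidence target, the assumption of independent sampling of colonies, and the function names are illustrative assumptions, not part of the described experiment.

```python
import math


def expected_colony_color(enzyme: str) -> str:
    """Predicted X-gal plate phenotype in the mock screen."""
    # Active cephalosporinase cleaves the cephem linker, so lacZ is not
    # transcribed (white); the slow PBP leaves the dimerizer intact (blue).
    return {"cephalosporinase": "white", "PBP": "blue"}[enzyme]


def colonies_needed(fraction_active: float, confidence: float) -> int:
    """Smallest colony count giving at least `confidence` probability of
    one or more active clones, assuming colonies are sampled independently."""
    if not 0.0 < fraction_active < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction_active))


# 1:99 plasmid ratio from the text; the 95% target is an assumed example.
n = colonies_needed(fraction_active=0.01, confidence=0.95)
print(n, expected_colony_color("cephalosporinase"), expected_colony_color("PBP"))
```

Under these assumptions a few hundred colonies suffice, which is well within the capacity of ordinary plate assays.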

Level of catalytic activity detected using the CID screen

While these experiments will show that the CID screen can detect
catalytic activity, they will not show that the screen can be
used to amplify enzymes with low levels of catalytic activity.
Thus, our next step is to use cephalosporinase mutants with a
range of catalytic efficiencies to quantify and then optimize
the sensitivity of the system. Many β-lactamase mutants, either
found in clinical settings or constructed by site-directed
mutagenesis, have been fully characterized kinetically. Known
mutants of the Q908R cephase, the E. cloacae P99 cephase (99%
identical), and the E. coli K12 AmpC β-lactamase (71%
homologous) are available spanning a wide range of kcat, Km, and
kcat/Km values (Table II). To accurately gauge the relative
activities of the mutants in the CID and ampR screens, we will
determine kinetic rate constants for the corresponding Q908R
cephase variants with the Dex-cephem-Mtx and ampicillin
substrates and nitrocefin as a control. The Q908R cephase
variants will be constructed in the E. coli expression vector
by site-directed mutagenesis, using a PCR-based method. These
proteins will then be purified by nickel-affinity
chromatography, and rate constants will be determined by UV
spectroscopy, monitoring the disappearance of absorbance due to
the β-lactam bond.

After determining the activity of the mutants with Dex-cephem-
Mtx and ampicillin in vitro, these same mutants are tested in
the CID and ampR screens. In addition to plate and more
quantitative liquid lacZ assays, the mutants will be evaluated
using a ura3 reporter gene. Ura3, which encodes orotidine-5'-
phosphate decarboxylase and is required for uracil biosynthesis,
is used routinely as a selectable marker in yeast. Since large
numbers of protein variants need to be screened for the
evolution experiments, it will be important to move from a
screen to a growth selection. Ura3 has the advantage that it
can be used both for positive and negative selections: positive
for growth in the absence of uracil and negative for conversion
of 5-fluoroorotic acid (5-FOA) to 5-fluorouracil, a toxic
byproduct. Cleavage of the cephem bond and disruption of ura3
transcription will be selected for based on growth in the
presence of 5-FOA. The advantage to the 5-FOA selection is that
the timing of addition of both the Dex-cephem-Mtx substrate and
5-FOA can be controlled. Several other reporter genes, however,
have been reported. The mutants are evaluated in E. coli using
nitrocefin screens and ampR selections. Mutants with higher
activity (kcat/Km) will still show an enzyme-dependent signal
(failure to hydrolyze X-gal or growth in the presence of 5-
FOA; nitrocefin hydrolysis or resistance to ampicillin), but at
some point these assays will not be able to detect the less
active mutants. In addition to suggesting what range of
activities can be detected with these assays, these experiments
may bring surprising results. For example, it may be that
detection correlates more strongly with kcat than with Km or
kcat/Km. Assuming a dynamic range of >1000, we will proceed with
the enzyme evolution experiments. Otherwise, we will focus on
optimizing the sensitivity of the screen until we reach this
level of sensitivity. The optimization experiments will
continue along the same lines as the proof-of-principle
experiments, varying the levels and timing of both protein
expression and addition of the substrate and product, except
they will be carried out with mutant cephases at the limit of
detection.
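
The positive and negative ura3 selections described above can be summarized as a small truth table. The sketch below is an idealized, all-or-none model: it assumes that an active enzyme fully cleaves the CID and that the 5-FOA medium is supplemented with uracil. The function names and medium labels are hypothetical.

```python
def ura3_expressed(dimerizer_intact: bool) -> bool:
    """ura3 is transcribed only while the CID holds LexA and B42 together."""
    return dimerizer_intact


def grows(enzyme_active: bool, medium: str) -> bool:
    """Predicted growth under the two selections described in the text.

    medium: "-uracil" (positive selection) or "+5-FOA" (negative selection,
    assumed here to be supplemented with uracil).
    """
    expressed = ura3_expressed(dimerizer_intact=not enzyme_active)
    if medium == "-uracil":
        return expressed        # Ura3 is required for uracil biosynthesis
    if medium == "+5-FOA":
        return not expressed    # Ura3 converts 5-FOA to a toxic product
    raise ValueError(f"unknown medium: {medium}")


for active in (True, False):
    for medium in ("-uracil", "+5-FOA"):
        print(f"enzyme_active={active!s:5} {medium:8} grows={grows(active, medium)}")
```

In this idealized picture, only cells with an active enzyme grow on 5-FOA, which is the selection for cephem cleavage described above.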

Table II. Wild-type and mutant enzymes are shown with their
kinetic rate constants with the chromogenic cephalosporin
nitrocefin, as well as the percentage of wild-type kcat/Km as
calculated in that experiment.

Enzyme                Km (µM)      kcat (s^-1)    kcat/Km (M^-1 s^-1)   % WT
E. cloacae P99 wt     25 ± 1       780 ± 30       3.1 x 10^7            100
E. cloacae Q908R wt   23 ± 1       780 ± 30       3.4 x 10^7            100
K12 AmpC wt           500 ± 100    490 ± 90       1.0 x 10^6            100
P99 286-290 TSFGN     19 ± 0.5     261 ± 7        1.37 x 10^7           96
P99 286-290 LTSNR     43 ± 2       330 ± 11       7.7 x 10^6            54
P99 286-290 NNAGY     31 ± 11      53 ± 10        1.7 x 10^6            12
K12 Y150S             108 ± 21     2.11 ± 0.12    1.9 x 10^4            ~1
K12 Y150E             356 ± 34     0.51 ± 0.03    1.4 x 10^3            ~0.1
Q908R S64C            > 1000       > 18           1.76 x 10^4           0.05
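
As a consistency check on Table II, the sketch below recomputes kcat/Km from Km and kcat for a few rows and reports the span of catalytic efficiencies. It assumes Km is given in micromolar and kcat in s^-1, uses only the central values (ignoring the reported uncertainties), and the function name is hypothetical.

```python
# (enzyme, Km in micromolar, kcat in s^-1): central values from Table II
ROWS = [
    ("E. cloacae P99 wt", 25.0, 780.0),
    ("K12 AmpC wt", 500.0, 490.0),
    ("K12 Y150E", 356.0, 0.51),
]


def catalytic_efficiency(km_um: float, kcat_per_s: float) -> float:
    """kcat/Km in M^-1 s^-1, converting Km from micromolar to molar."""
    return kcat_per_s / (km_um * 1e-6)


efficiencies = {name: catalytic_efficiency(km, kcat) for name, km, kcat in ROWS}
for name, value in efficiencies.items():
    print(f"{name:20s} kcat/Km = {value:.2e} M^-1 s^-1")

# Ratio of the most to the least efficient enzyme in this subset
span = max(efficiencies.values()) / min(efficiencies.values())
print(f"span of kcat/Km in this subset: {span:.0f}-fold")
```

The recomputed values agree with the tabulated kcat/Km column to within rounding, and the subset alone spans more than the 1000-fold range discussed in the text.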

Example 3

CIDs can be used to screen cDNA libraries based on biochemical
function. This glycosidase example is used to determine the
best method for expressing the cDNA clones and to optimize the
screening process.

Proof of Principle - β-Galactosidase Activity Assays

Table III explains the components of each strain. Each strain
was constructed from the parent yeast strain FY250 and also
contains the pMW106 plasmid, which has the lacZ reporter gene
that is turned on only when the LexA DNA binding domain and
the B42 activation domain are brought into the vicinity of each
other. We use several different strains because we use DHFR from
two different species: mDHFR is murine, while eDHFR is from
E. coli. We are also able to switch the small molecule-binding
domains. For example, the strain containing LexA-eDHFR with
B42-rGR2 is a different strain and behaves differently from the
strain containing LexA-rGR2 with B42-eDHFR. We also put in
short 6 amino acid linkers between the two domains of our
protein chimeras, and thus these are different strains as well.

Next, we have chosen to screen a yeast cDNA library for proteins 
with glycosidase activity (Figure 19) . 
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Table III. Identification of stains used. (Key: eDHFR^E.coli 
Dihydrof olate Reductase; rGR2=stereoid binding domain of rat 
Glucocorticoid Receptor (aa 524-795) with point mutations; 
(rGR2) 3=trimer of rGR2; mDHFR^murineDihydrof olate Reductase; 
gly6=6 amino acid linker conaining 6 glycines; {GSG)2=6 amino 
acid linker containing glycine-serine-glycine-glycine- serine- 
glycine. ) 
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P-Galactosidase Activity Assay Results 

The results in Table IV are averages of two separate trials. 
Each strain was examined with small molecules and without small 
molecules. The absolute activity is given as the p~ 
5 galactisidase activity with small molecule subtracted from the 
p-galactosidase activity without small molecule. The average 
p-galactosidse activity for a strain without small molecule 
(i,e. the negative control) was about 100 p-galactosidase units, 
V133Y is a positive control and shows P~galactosidase activity 
10 regardless of the presence of small molecule. The (J- 
galactosidase activity of strain V494Y using varying 
concentrations of D8M is shown in Figure 18. 

Table IV - β-Galactosidase Activity Assays

Glycoconjugates are the most functionally and structurally
diverse molecules in nature. [Varki, 1993] Moreover, it is now
well established that carbohydrates and protein- and lipid-
bound saccharides play essential roles in many important
biological processes, including cell structure, protein
targeting, and cell-cell interactions. [Varki, 1993]
Accordingly, glycosidases with a broad array of substrate
specificities are required to break down and modify
polysaccharides, glycoproteins, and glycolipids.

Using CIDs with structurally diverse carbohydrate linkers, we
screen a S. cerevisiae cDNA library based on glycosidase
activity. There are many examples of well-characterized
glycosidases identified in other organisms that are yet to be
identified in S. cerevisiae. α-Amylase [Sogaard, 1993; Vihinen,
1990; Qian, 1994; Wiegand, 1995; Fujimoto, 1998; Wilcox, 1984]
and xylanase [Wong, 1988; Biely, 1997] are endo-glycosidases that
break down polysaccharides involved in energy storage and cell
structure, respectively. Glycoproteins are synthesized by
modification of a core glycoside. The GlcNAcβ1→Asn and
GlcNAcβ1→4GlcNAc linkages in Asn-linked carbohydrates are
cleaved by peptide-N4-(N-acetyl-β-glucosaminyl)asparagine
amidase (PNGase F) and endo-β-N-acetylglucosaminidases (Endo H
and Endo F1), respectively. [Tarentino, 1990; Tarentino, 1992;
Robbins, 1984; Trimble, 1991] Since each of these enzymes is an
endo-glycosidase, the CID ligands should not interfere with the
enzyme-catalyzed reaction. Likewise, by making a small library
of carbohydrate linkers, we screen in an undirected fashion.

The diversity of naturally occurring carbohydrates requires us
to make a library of Dex-Mtx CIDs with different carbohydrate
linkers. Recent advances in the synthesis of oligosaccharides,
both in the coupling methods [Schmidt, 1986; Toshima, 1993;
Boons, 1996] and in the solid-phase synthesis, [Danishefsky,
1993; Seeberger, 1998; Yan, 1994; Liang, 1996] make it possible
to synthesize these linkers. We have chosen to use a method
developed by Kahne and co-workers which uses anomeric sulfoxides
as glycosyl donors and synthesizes carbohydrates from the
reducing to the non-reducing end. [Yan, 1994; Liang, 1996] This
method can be used both in solution and on solid support, can
form both α- and β-glycosidic bonds, and utilizes readily
synthesized intermediates. Several alternative methods,
however, are available, including Wong and co-workers' one-pot
solution synthesis [Zhang, 1999; Ye, 2000] and the solid-phase
glycal strategy reported by Danishefsky and co-workers.
[Danishefsky, 1993; Seeberger, 1998]
We screen a yeast cDNA library based on glycosidase activity
using Dex-Mtx CIDs with cleavable glycosidic linkers (Fig. 12).
Concurrently, we identify glycosidases from a S. cerevisiae cDNA
library by screening for cleavage of CIDs with glycosidic
linkages. The Dex-Mtx yeast two-hybrid assay is used as the
screen by replacing Dex-Mtx with Dex-oligosaccharide-Mtx.
First, we carry out a control where we screen for a known
glycosidase, chitinase, using a defined substrate. Second, we
screen for unknown glycosidases by using a small library of
substrates with different glycosidic bonds.

Screen of a S. cerevisiae cDNA Library Based on Glycosidase 
Activity 

Using Dex-Mtx CIDs with cleavable oligosaccharide linkers, we
screen a S. cerevisiae cDNA library based on glycosidase
activity. As a control, we screen for a known S. cerevisiae
glycosidase, chitinase. Then, we synthesize a small library of
Dex-carbohydrate-Mtx substrates and screen the S. cerevisiae cDNA
library to identify glycosidases from the >3000 ORFs of unknown
function in S. cerevisiae.

Introduction of a S. cerevisiae cDNA library into the CID 
selection strain 

The first step of both the chitinase control and the random
oligosaccharide library is to introduce a S. cerevisiae cDNA
library into the CID selection strain. We use a cDNA library
reported by Fields and co-workers. [Martzen, 1999] In this
library, each cDNA clone is expressed as a GST-fusion protein
under control of a copper-inducible promoter on a shuttle vector
with a leu2 marker. [Martzen, 1999; J. R. Hudson, 1997]
Transformation efficiencies in yeast are ca. 10^6-10^7 using the
lithium acetate method, so there is ample redundancy to screen
all 6,000 ORFs in S. cerevisiae. Active clones can be
identified by sequencing the plasmid. For the chitinase control
experiment, we make a library with a subset of cDNA clones to
test different approaches for expressing the cDNA clones.
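
The redundancy claim can be made concrete with a standard sampling estimate. The sketch below assumes, as a simplification not stated in the text, that every ORF is equally represented in the library and that transformants are independent draws; the function names are illustrative.

```python
import math

def expected_coverage(num_transformants, library_size):
    """Probability that a given clone appears at least once among
    num_transformants independent, equally likely draws."""
    # log1p keeps the result accurate for large libraries.
    return 1.0 - math.exp(num_transformants * math.log1p(-1.0 / library_size))

def transformants_needed(library_size, coverage):
    """Draws needed so a given clone is present with the target probability
    (coverage must be strictly between 0 and 1)."""
    return math.ceil(math.log(1.0 - coverage) / math.log1p(-1.0 / library_size))

orfs = 6000
for n in (10**4, 10**5, 10**6):
    print(n, round(expected_coverage(n, orfs), 4))
print(transformants_needed(orfs, 0.99))
```

Under these assumptions, roughly 28,000 transformants already give a 99% chance of sampling any one ORF, so 10^6-10^7 transformants oversample the library by a wide margin. Real libraries are unevenly represented, so the true requirement is higher.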

Can the S. cerevisiae chitinase be identified using the CID
selection? 

We begin by screening a S. cerevisiae cDNA library for a known
glycosidase, chitinase. Chitinase hydrolyzes chitin, polymers
of β-1,4-linked N-acetylglucosamine (GlcNAc) that play a
structural role in the cell. [Muzzarelli, 1977] Chitinases from
several organisms, including S. cerevisiae, have been cloned and
characterized. [Correat, 1982; Kuranda, 1987; Kuranda, 1991] It
is known that this enzyme can hydrolyze oligomers of β-1,4-
GlcNAc ranging from trimers to heterogeneous polymers,
suggesting that CIDs such as Dex-(GlcNAc)n-Mtx should be
efficient substrates for this enzyme. Several efficient
syntheses of β-1,4-linked GlcNAc have been published. [Banoub,
1992]

The retro-synthetic analysis of our Dex-(GlcNAc)n-Mtx CID
substrate is shown in Figure 20.

The growing carbohydrate chain is linked to the solid support
via the Glu portion of Mtx. The glycosidic linkages are formed
essentially as reported by Kahne and co-workers using sulfoxide
glycosyl donors. [Yan, 1994; Liang, 1996] The final carbohydrate
is introduced as a Dex derivative, and the Mtx synthesis is
completed prior to cleavage from the solid support. This
synthesis allows the oligosaccharide linker to be varied and
both the Dex and the Mtx ligand to be introduced before cleavage
from solid support. Alternatively, the synthesis can be carried
out in solution, [Kahne, 1989] or other methods for carbohydrate
synthesis can be employed. [Zhang, 1999; Ye, 2000; Danishefsky,
1993; Seeberger, 1998] We start with a GlcNAc tetramer, as
trimers have been shown to be the shortest efficient substrates
for chitinases. [Watanabe, 1993]

Initially, lacZ plate assays are used to verify that the
Dex-(GlcNAc)n-Mtx substrates are efficient dimerizers in the yeast
three-hybrid assay. The results with Dex-cephem-Mtx support the
feasibility of incorporating structurally diverse linkers into
the CIDs. If the initial chitinase substrates, however, are not
efficient dimerizers, the linkers between the CID ligands and
the GlcNAc oligomer can be varied, or alternate dimerization
assays can be tested. Since large numbers of cDNA clones need
to be screened, the transcriptional read-out of the yeast
three-hybrid assay may be changed from a screen to a growth
selection. Specifically, ura3, which encodes
orotidine-5'-phosphate decarboxylase and is required for uracil
biosynthesis, replaces lacZ as the reporter gene. [Boeke, 1984]
ura3 has the advantage that it can be used both for positive and
negative selections: positive for growth in the absence of
uracil and negative for conversion of 5-fluoroorotic acid
(5-FOA) to 5-fluorouracil, a toxic byproduct. Cleavage of the
glycosidic bond and disruption of ura3 transcription is selected
for based on growth in the presence of 5-FOA. The advantage to
the 5-FOA selection is that the timing of addition of both the
Dex-(GlcNAc)n-Mtx substrate and 5-FOA can be controlled. Several
other reporter genes, however, can be used.

One potential problem is that the Dex-(GlcNAc)n-Mtx substrate
is unstable, either because of its intrinsic half-life in water
or because it is turned over by cellular glycosidases. If the
substrate has a short half-life in water, the assay conditions
can be modified so that the substrate is added late in the assay
after the cells have grown to a high density, the substrate can
be continuously replenished, or the pH of the media can be
buffered. Turnover by cellular glycosidases can simply be seen
as an assay in and of itself. Using traditional genetic
approaches, random mutations can be introduced into the
S. cerevisiae genome or the tagged knock-out strains of Winzeler
et al. can be used. [Winzeler, 1999] Cells containing a
disruptive mutation in the gene or genes cleaving the
Dex-(GlcNAc)n-Mtx substrate can be selected for by growth in the
absence of uracil.

The final step is to use the Dex-(GlcNAc)n-Mtx substrate to pull
out chitinase from a S. cerevisiae cDNA library. As described
above, a 5-FOA growth selection is used to screen the Fields
cDNA library. In the absence of chitinase, Dex-(GlcNAc)n-Mtx
induces ura3 transcription, and 5-FOA is converted to the toxic
byproduct 5-fluorouracil. Thus, only cells containing active
chitinase, or another enzyme that can cleave the substrate,
survive. The cDNA clone is readily identified by isolating the
plasmid, sequencing the N-terminus of the clone, and comparing
this sequence to that of the S. cerevisiae genome. The
advantage of using a known enzyme is that the enzyme can be
tested independently or used to spike the cDNA library. The
enzyme can be purified, and the Dex-(GlcNAc)n-Mtx substrate can
be tested in vitro. We can vary the format of the cDNA library,
the Dex-(GlcNAc)n-Mtx substrate, the screen, or the assay
conditions, or even use a different glycosidase as a control.
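
The logic of the two ura3 selections described above can be summarized as a small truth table. The following Python sketch is a deliberately simplified Boolean model, with illustrative names; it ignores expression levels, timing, and substrate stability, which the text identifies as experimental variables.

```python
def ura3_is_transcribed(cid_present, cid_cleaved):
    """ura3 is on only while an intact CID bridges the two fusion proteins."""
    return cid_present and not cid_cleaved

def grows(ura3_on, medium):
    """Growth outcome for the two selections described in the text."""
    if medium == "minus_uracil":   # positive selection: growth needs ura3
        return ura3_on
    if medium == "plus_5FOA":      # negative selection: ura3 makes 5-FOA toxic
        return not ura3_on
    raise ValueError(f"unknown medium: {medium}")

# Compare a clone without and with an enzyme that cleaves the CID linker.
for has_glycosidase in (False, True):
    ura3_on = ura3_is_transcribed(cid_present=True, cid_cleaved=has_glycosidase)
    print(has_glycosidase,
          grows(ura3_on, "minus_uracil"),
          grows(ura3_on, "plus_5FOA"))
```

In this model a clone that cleaves the substrate is the only one that grows on 5-FOA, which is the outcome the selection relies on.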

Can glycosidases be identified from the >3000 unassigned ORFs
in S. cerevisiae using the CID selection? 

The next step is to determine the activity of the >3000 ORFs in
S. cerevisiae with unknown function. To detect glycosidase
activity, the screen is run exactly as with the chitinase
control except using Dex-oligosaccharide-Mtx substrates with
different glycosidic linkages. The glycosidic linkages are based
on the types of carbohydrates and glycoconjugates naturally
occurring in yeast. Several activities, including
amylase, [Sogaard, 1993; Vihinen, 1990; Qian, 1994; Wiegand,
1995; Fujimoto, 1998; Wilcox, 1984] xylanase, [Wong, 1988; Biely,
1997; Georis, 1999] and endo-N-acetylglucosamine hydrolysis
activity, [Tarentino, 1990; Tarentino, 1992; Robbins, 1984;
Trimble, 1991] can be targeted specifically.

Dex-Mtx CIDs with different oligosaccharide linkers are prepared
using the same strategy as for the chitinase substrate (above).
The sulfoxide glycosyl donor method for carbohydrate synthesis
allows a variety of sugar monomers to be introduced. [Kahne,
1989] Moreover, both the regio- and stereo-chemistry can be
controlled. [Yan, 1994; Liang, 1996] As with the chitinase
control, the 5-FOA growth selection is used to identify enzymes
that cleave the various glycosidic linkages. Each glycoside
substrate is tested individually. Mixtures of substrates cannot
be tested because the uncleaved substrates would continue to
activate ura3 transcription. If the screen does not pick up any
enzymes, known glycosidases from other organisms may be used as
controls both for the growth selections and to test the Dex-Mtx
substrates in vitro.
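
The reason mixtures cannot be tested is that reporter transcription persists as long as any intact CID remains. A minimal sketch of this point follows; the substrate names are hypothetical labels, and the model assumes a clone cleaves only the linkages listed for it.

```python
def survives_5foa(substrates_added, linkages_cleaved_by_clone):
    """A cell survives 5-FOA only if no intact CID remains to drive ura3."""
    intact = [s for s in substrates_added if s not in linkages_cleaved_by_clone]
    return len(intact) == 0

# Hypothetical clone that cleaves only the GlcNAc-linked substrate.
clone_activity = {"Dex-(GlcNAc)4-Mtx"}

# Single substrate: the active clone survives and can be recovered.
single = survives_5foa(["Dex-(GlcNAc)4-Mtx"], clone_activity)
# Mixture: the uncleaved second substrate keeps ura3 on, masking the hit.
mixture = survives_5foa(["Dex-(GlcNAc)4-Mtx", "Dex-xylan-Mtx"], clone_activity)
print(single, mixture)
```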

The foregoing permits the characterization of in vitro activity
and biological function of glycosidases identified using the CID
screen. Similarly, cDNA libraries from other organisms can be
screened. The Dex-Mtx substrates can be used to evolve
glycosidases with unique specificities. In addition, the cDNA
screen can be extended to other classes of enzymes, such as
proteases.

Example 4

Development of CID with a Suicide Substrate ("covalent CID") 

As shown in Figure 13 and the accompanying discussion, four
non-covalent interactions have to take place simultaneously for
the reporter protein to be activated: 1) the DNA-binding
protein-DNA interaction, 2) the 1st ligand-receptor
interaction, 3) the 2nd ligand-receptor interaction, and 4) the
activation domain-transcription machinery interaction.

However, it is possible to replace the 1st ligand-receptor pair
(Dex-GR in Figure 13) with a small molecule-receptor pair that
forms an irreversible covalent linkage, making a system with
only three non-covalent interactions. Such an approach allows
for the screening of small molecules to identify their cellular
targets. This covalent CID system is used to screen
ligand-receptor interactions, which previously required
laborious photo-crosslinking, radiolabeled ligand binding, and
affinity chromatography techniques. The covalent system is more
sensitive than the Dex-Mtx system because the covalent bond
gives a k_off of zero for the covalent ligand-protein binding
pair, which improves the cut-off Kd of the whole system.
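
The sensitivity argument can be illustrated with a simple equilibrium estimate. The sketch below assumes, for illustration only, that each non-covalent site is occupied independently with fraction [L]/([L] + Kd), that the covalently modified site is fully occupied, and that the dimerized fraction is the product of the two site occupancies. The Kd values and concentration are made-up inputs, not measurements from this work, and the function names are hypothetical.

```python
def occupancy(ligand_conc, kd):
    """Equilibrium fractional occupancy of a single non-covalent site."""
    return ligand_conc / (ligand_conc + kd)

def dimerized_fraction(ligand_conc, kd_site1, kd_site2=None):
    """Product of site occupancies; kd_site2=None models a covalent site
    (k_off = 0), treated here as fully occupied."""
    site2 = 1.0 if kd_site2 is None else occupancy(ligand_conc, kd_site2)
    return occupancy(ligand_conc, kd_site1) * site2

cid = 1e-6      # hypothetical CID concentration (1 uM)
weak_kd = 1e-6  # hypothetical moderate-affinity interaction (Kd = 1 uM)

noncovalent = dimerized_fraction(cid, weak_kd, kd_site2=1e-6)
covalent = dimerized_fraction(cid, weak_kd, kd_site2=None)
print(noncovalent, covalent)
```

In this crude model, removing one reversible interaction raises the dimerized fraction at any given affinity of the remaining pair, which is the sense in which the detectable Kd cut-off improves.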

The covalent CID system is constructed by the same principles
as other CIDs, except that one of the ligand-receptor pairs is
selected so that in vivo the pair is fixed by a covalent bond
and the cell read-out depends on the other ligand-receptor
interaction.

The covalent CID should find broad use anytime a covalent
linkage between the ligand and receptor increases the efficacy
of the system. The ligand might be a small molecule, e.g. a
drug, and the target may be a protein responsible for the drug's
efficacy or for unwanted toxic side effects. The small molecule
may also be a cofactor or hormone and the goal might be to
screen a genomic library to identify proteins that bind to the
given cofactor or hormone. In both cases, the covalent CID
allows not only high affinity (nM), but also moderate affinity
(µM), interactions to be detected. Reasonable targets for
covalent CIDs include suicide substrate-enzyme pairs, which in
this example are Fluorouracil-Thymidylate Synthase and Cephem-
Penicillin Binding Protein.



[Reaction schemes: Fluorouracil-Thymidylate Synthase;
Cephem-Penicillin-Binding Protein]



The above two suicide substrate/enzyme pairs are selected
because they are stable at physiological pH and activated toward
covalent modification only in the enzyme active site. In
addition, an antibiotic-bacterial enzyme pair has the advantage
that it can readily be transferred to mammalian cells without
toxicity effects. Furthermore, as shown in our PCT International
Publication No. WO 01/53355, the contents of which are hereby
incorporated by reference, Dex-cephem-Mtx CIDs are cell
permeable.

In this example we use cephem and Streptomyces R61 penicillin
binding protein to generate this covalent bond. The reaction
between β-lactam compounds and the penicillin binding protein
is well studied, and the R61 enzyme is well characterized
biochemically and structurally (Kuzin et al, 1995; Kelly, 1995).
The synthesis of β-lactam compounds is also well established.
The Mtx-eDHFR ligand-receptor pair is kept in this new
system because this pair has higher affinity and better small
molecule cell permeability than the Dex-rGR ligand-receptor
pair. The cephem-Mtx CID shown below is synthesized by analogy
to our syntheses of Dex-cephem-Mtx, as shown in Figure 21.





The above CID for this covalent system consists of two ligands:
one is Mtx, which binds to DHFR; the other is cephem,
which can covalently bind to R61. These two ligands are
connected by a hydrophobic linker. We chose to attach the
Mtx to the C7 position of the cephem because this position can
be modified without disrupting the cephem's activity.

Testing of this or a similar molecule for its ability to activate
lacZ transcription in the yeast three-hybrid assay when the GR
receptor is replaced with the R61 Penicillin-Binding Protein has
been described. Since Mtx-DHFR variants with a broad range of
Kd, k_on, and k_off values are known, we can use these variants
to compare the ability of the noncovalent Dex CID and the
covalent cephem CID to detect moderate affinity interactions.
The cephem and the Mtx-cephem linker can be readily varied and
other suicide substrate-enzyme pairs can be evaluated.




Subcloning of R61 Penicillin Binding Protein; Generation of
E. coli Strain

The elements of the covalent CID system correspond with the
Dex-Mtx yeast three-hybrid system. It is composed of the small
molecule, the LexA DNA binding domain chimera (LexA-DHFR or
LexA-R61), the B42 transcription activation domain chimera
(B42-R61 or B42-DHFR), and the reporter gene (lacZ). Plasmids
encoding the protein chimeras were constructed by subcloning and
were transferred to yeast.

Table 1 lists all of the R61 constructs prepared. All of the 
plasmids were sequenced and no mutation was found. 

Table 1

Plasmid on which     Fusion protein       Plasmid Name   Strain Name
construct is based

p/T/R61              R61 without EcoR I   pTEMPPKX720    V720E
pMW102               B42-R61              pGBPKT719      V719E
pMW102               B42-GSGGSG-R61       pGBPKT779      V779E
pMW103               LexA-R61             pALPKH755      V755E
pMW103               LexA-GSGGSG-R61      pALPKH754      V754E



Yeast strain

All of the final diploid yeast screening strains are
generated by mating. pMW102 plasmids were transformed into the
FY250 and EGY48 strains. pMW103 plasmids and reporter gene
plasmids (pMW106 or pMW112) were transformed into the FY251
strain. Table 2 lists all of the haploid yeast strains prepared.

Table 2

Haploid strain   Plasmid                    Plate   Strain #

FY250:V240Y      pMW102R61                  HT_03   NN
FY250            pMW102GSGGSGR61            HT_03   NN
FY250            pMW102eDHFR                HT_03   NN
FY250            pMW102 blank               HT_03   NN
FY250            pMW102rGR2                 HT_03   NN
EGY48:BTC        pMW102R61                  HT_04   NN
EGY48            pMW102GSGGSGR61            HT_04   NN
EGY48            pMW102eDHFR                HT_04   NN
EGY48            pMW102 blank               HT_04   NN
EGY48            pMW102rGR2                 HT_04   NN
FY251:V525Y      pMW103R61; pMW106          HT_01   NN
FY251            pMW103R61; pMW112          HT_01   NN
FY251            pMW103eDHFR; pMW106        HT_02   NN
FY251            pMW103eDHFR; pMW112        HT_02   NN
FY251            pMW103GSGGSGR61; pMW106    HT_01   NN
FY251            pMW103GSGGSGR61; pMW112    HT_01   NN
FY251            pMW103 blank; pMW106       HT_02   NN
FY251            pMW103 blank; pMW112       HT_02   NN

The haploid strains are mated, resulting in diploid strains
that contain the pMW103, the pMW102 and the reporter plasmid,
or some permutation thereof. The yeast strains are frogged to
SC(H-U-T-)/galactose/raffinose liquid media in 96-well plates
and incubated in a 30°C shaker for two days and then frogged to
X-gal plates, and X-gal plates with 1 mM Mtx-cephem. The plates
are incubated at 30°C for two days. The colonies of cells
having the plasmids being selected for (for example, the strain
which has R61-LexA fusion protein, DHFR-B42 fusion protein, and
the pMW106 reporter gene) turn blue on the Mtx-cephem X-gal
plate, but are white on the general X-gal plate. The positive
control (yeast two-hybrid system) and negative control (lacking
one of the fusion proteins) are used during the experiment.
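
The mating scheme amounts to crossing each pMW102 haploid with each pMW103 haploid and predicting the X-gal phenotype of the resulting diploid. The sketch below uses the fusion labels from Table 2 and assumes a reporter plasmid is present in every diploid. The phenotype rule is a simplified reading of the text (blue only when one fusion carries R61, the other carries eDHFR, and Mtx-cephem is on the plate), and the function names are illustrative.

```python
from itertools import product

# Receptor fusions carried by the haploid parents (labels from Table 2).
ad_haploids = ["R61", "GSGGSG-R61", "eDHFR", "blank", "rGR2"]  # pMW102, B42
dbd_haploids = ["R61", "eDHFR", "GSGGSG-R61", "blank"]         # pMW103, LexA

def receptor(label):
    """Strip the optional linker prefix to get the receptor domain."""
    return label.replace("GSGGSG-", "")

def predicted_color(dbd, ad, mtx_cephem_present):
    """Blue only if the CID can bridge an R61 fusion and an eDHFR fusion."""
    pair = {receptor(dbd), receptor(ad)}
    bridged = mtx_cephem_present and pair == {"R61", "eDHFR"}
    return "blue" if bridged else "white"

diploids = list(product(dbd_haploids, ad_haploids))
for dbd, ad in diploids:
    print(f"LexA-{dbd} x B42-{ad}: "
          f"{predicted_color(dbd, ad, True)} with CID, "
          f"{predicted_color(dbd, ad, False)} without")
```

Diploids carrying a blank plasmid or two copies of the same receptor serve as the negative controls in this simplified picture.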
Comparison between the covalent CID system and the Dex-Mtx CID
system shows that the covalent CID system gives positive results
at lower CID concentrations. Any mutation in the DHFR gene that
lowers the affinity of the DHFR protein for Mtx results in a
negative result in the Dex-Mtx CID system, but a positive result
in the covalent CID system.

Example 5

cDNA Binding Screen: Steroids 

We can screen proteins from cDNA libraries based on binding
activity using a modified yeast three-hybrid assay. This method
screens cDNA libraries based on function. The advantage of this
method is that it is straightforward using existing technology.

Initially we synthesize several, e.g. 5-10, CIDs, each
comprising a methotrexate moiety covalently linked to a
different steroid. These steroid-Mtx CIDs are screened against
a S. cerevisiae two-hybrid library where DBD-DHFR is held
constant and the AD-cDNA library is the variable. Each time a
given steroid binds to a given S. cerevisiae protein, the
reporter gene should be activated. The steroid-Mtx analogs can
be chosen at will, and their syntheses are known.

First, we test Dexamethasone-Mtx, primarily because Dex has a
common A-ring. Second, we synthesize different steroids with
common A-ring structures. We have chosen to focus on varying
A-rings because 1) natural steroids often differ primarily in
their A-rings, 2) it allows us to use the same chemistry to
synthesize all of the steroid analogs, and 3) there are many
examples of natural steroid-receptor complexes where the A-ring
is buried in the protein-binding pocket, while the D-ring can
be derivatized without disrupting receptor binding.
Specifically, we synthesize Steroid-Mtx CIDs based on the
steroids Dexamethasone, Estrone, Progesterone, Cholesterol, and
Lanosterol. These steroids are chosen because they have
representative A-rings and because they play important
physiological roles (Lanosterol specifically in yeast).

To simplify the chemistry, steroids that retain similar A/B/C
rings, but have one of two D rings, may be used. Specifically,
such steroids are 3β-Hydroxy-5-cholen-24-oic acid (Aldrich),
Eburicolic acid (Aplin Chemicals), Progesterone (Aldrich),
Estrone (Aldrich), and Dexamethasone (Aldrich).

If any steroid is not available as a carboxylic acid, it can be
converted to a carboxylic acid by the representative scheme
shown in Figure 19.

These carboxylic acids will then be coupled to methotrexate by
analogy to the synthesis of Dex-Mtx in Figure 20.

In addition to the dihalo linker shown in Figure 20, we
synthesize the Steroid-Mtx CIDs with the linker
1,10-diiododecane, which has also been successfully used to make
Dex-Mtx.

Screens 

These CIDs are screened against a yeast ORF library fused to
an activation domain using the yeast three-hybrid screen. This
screen can be done using technology already in place at
GPC-Biotech. We should start screening immediately with Dex-Mtx
to work out any bugs while we are preparing the other
Steroid-Mtx compounds.

Results 

This screen efficiently picks out both known and unknown
steroid-binding proteins.

What is claimed is:

1. A compound having the formula:

H1-Y-H2

wherein H1 is a substrate capable of selectively binding
to a first receptor;

wherein H2 is a substrate capable of selectively binding
to and selectively forming a covalent bond with a second
receptor; and

wherein Y is a moiety providing a covalent linkage between
H1 and H2, which may be present or absent, and when absent, H1
is covalently linked to H2.

2. The compound of claim 1, wherein H1 is a Methotrexate
moiety or an analog thereof.

3. The compound of claim 1, wherein H2 is a cephem moiety
capable of selectively binding to and selectively forming
a covalent bond with the penicillin-binding-protein
("PBP").

4. The compound of claim 1, wherein H2 is a fluorouracil
moiety capable of selectively binding to and selectively
forming a covalent bond with the thymidine synthase ("TS")
enzyme.



5. The compound of claim 2 having the structure:

[chemical structure]

6. A complex between the compound of claim 1 and a fusion
protein, the fusion protein comprising a receptor domain
which binds to the compound.

7. The complex of claim 6 wherein the fusion protein further
comprises a DNA-binding domain fused to the receptor
domain.

8. The complex of claim 6 wherein the fusion protein further
comprises a transcription activation domain fused to the
receptor domain.

9. The complex of claim 6, wherein the receptor domain is
dihydrofolate reductase ("DHFR"), penicillin-binding-
protein ("PBP"), or thymidine synthase ("TS") enzyme.

10. The complex of claim 9, wherein the PBP is the Streptomyces
R61 PBP.

11. The complex of claim 9, wherein the DHFR is the E. coli DHFR
("eDHFR").

12. The complex of claim 7, wherein the fusion protein is
eDHFR-LexA or R61-LexA.

13. The complex of claim 8, wherein the fusion protein is
eDHFR-B42 or R61-B42.

14. A cell comprising the complex of claim 6.

15. A cell comprising a DNA sequence which on transcription
gives rise to a first fusion protein exogenous to the cell
and a second fusion protein exogenous to the cell,

wherein the first fusion protein is a receptor domain fused
with a DNA-binding domain; and

wherein the second fusion protein is a transcription
activation domain fused to either a penicillin-binding-protein
("PBP") or to a thymidine synthase ("TS") enzyme.

16. The cell of claim 15, wherein the receptor domain of the
first fusion protein is DHFR.

17. The cell of claim 15, wherein the DNA-binding domain of the
first fusion protein is LexA.

18. The cell of claim 15, wherein the transcription activation 
domain of the second fusion protein is B42. 

19. The cell of claim 15, wherein the PBP is the Streptomyces 
R61 PBP. 

20. The cell of claim 15, wherein the first fusion protein is 
eDHFR-LexA, and the second fusion protein is R61-B42. 

21. The cell of claim 15, where the cell is a yeast cell, a 
bacteria cell or a mammalian cell. 

22. The cell of claim 15, where the cell is S. cerevisiae or
E. coli.

23. A method of dimerizing two fusion proteins inside a cell
using the compound of claim 1, comprising the steps of

a) providing a cell that expresses a first fusion protein
which comprises a receptor domain that binds to H1, and a second
fusion protein which comprises a receptor domain that binds to
and forms a covalent bond with H2, and

b) contacting the compound with the cell so as to dimerize
the two fusion proteins.

24. The method of claim 23, wherein the receptor domain of the 
first fusion protein is DHFR. 

25. The method of claim 23, wherein the DNA-binding domain of 
the first fusion protein is LexA. 

26. The method of claim 23, wherein the transcription 
activation domain of the second fusion protein is B42. 

27. The method of claim 23, wherein the receptor domain of the
second fusion protein is a penicillin-binding-protein
("PBP") or a thymidine synthase ("TS") enzyme.

28. The method of claim 27, wherein the PBP is the Streptomyces
R61 PBP.

29. The method of claim 23, wherein the first fusion protein 
is eDHFR-LexA, and the second fusion protein is R61-B42. 

30. A method for identifying a molecule that binds a known
target in a cell from a pool of candidate molecules,
comprising:

(a) forming a screening molecule by covalently bonding each
molecule in the pool of candidate molecules to a substrate
capable of selectively binding to and selectively forming a
covalent bond with a receptor;

(b) introducing the screening molecule into a cell culture
comprising cells that express

a first fusion protein of a DNA-binding domain fused
to a known target receptor domain against which the
candidate molecule is screened,

a second fusion protein which comprises a receptor
domain capable of binding to and forming a covalent bond
with the screening molecule, and

a reporter gene wherein expression of the reporter
gene is conditioned on the proximity of the first fusion
protein to the second fusion protein;

(c) permitting the screening molecule to bind to the
first fusion protein and to the second fusion protein, bringing
the two fusion proteins into proximity so as to activate the
expression of the reporter gene;

(d) selecting the cell that expresses the reporter gene; 

and 

(e) identifying the small molecule that binds the known 
target receptor. 

31. The method of claim 30, wherein the cell is selected from
the group consisting of insect cells, yeast cells,
mammalian cells, and their lysates.

32. The method of claim 30, wherein the DNA-binding domain of
the first fusion protein is LexA.

33. The method of claim 30, wherein the transcription 
activation domain of the second fusion protein is B42. 

34. The method of claim 30, wherein the receptor domain of the
second fusion protein is a penicillin-binding-protein
("PBP") or a thymidine synthase ("TS") enzyme.

35. The method of claim 34, wherein the PBP is the Streptomyces
R61 PBP.

36. The method of claim 30, wherein the molecule is obtained
from a combinatorial library.

37. The method of claim 30, wherein the steps (b)-(e) of the
method are iteratively repeated in the presence of a
preparation of random small molecules for competitive
binding with the screening molecule so as to identify a
molecule capable of competitively binding the known target
receptor.

38. A method for identifying an unknown target receptor to 
which a molecule is capable of binding in a cell, 
comprising: 

(a) providing a screening molecule having a ligand which 
has a specificity for the unknown target receptor covalently 
bonded to a substrate capable of selectively binding to and 
selectively forming a covalent bond with a receptor; 

(b) introducing the screening molecule into a cell which 
expresses 

a first fusion protein of a DNA-binding domain fused 
to the unknown target receptor domain against which the 
candidate molecule is screened, 

a second fusion protein which comprises a receptor 
domain capable of binding to and forming a covalent bond 
with the screening molecule, and 

a reporter gene wherein expression of the reporter 
gene is conditioned on the proximity of the first fusion 
protein to the second fusion protein; 

(c) permitting the screening molecule to bind to the first 
fusion protein and to the second fusion protein so as to 
activate the expression of the reporter gene; 

(d) selecting which cell expresses the unknown target 
receptor; and 

(e) identifying the unknown target receptor. 

39. The method of claim 38, wherein the unknown protein target 
is encoded by a DNA from the group consisting of 
genomic DNA, cDNA and synthetic DNA. 



40. The method of claim 38, wherein the ligand has a known 
biological function. 

41. A compound having the formula: 

H1-Y-H2 

wherein H1 is Mtx or an analog thereof; 

wherein H2 is a substrate capable of binding to a receptor, and 

wherein Y is a moiety providing a covalent linkage between 
H1 and H2, which may be present or absent, and when absent, H1 
is covalently linked to H2. 

42. The compound of claim 41, having the formula: 




(D8M) 



43. The compound of claim 41, having the formula: 




(D10M) 



44. The compound of claim 41, having the formula: 

45. The compound of claim 41, having the formula: 




(D7CM) 



46. The compound of claim 41, having the formula: 




(D8CM) 



47. A complex between the compound of claim 41 and a fusion 
protein which comprises a binding domain capable of binding 
to methotrexate, wherein H1 of the compound binds to the 
binding domain of the fusion protein. 

48. The complex of claim 47, wherein the binding domain is that 
of the DHFR receptor. 

49. The complex of claim 47, wherein the fusion protein is 
DHFR-LexA. 

50. The complex of claim 47, wherein the fusion protein is 
DHFR-B42. 

51. A cell comprising the complex of claim 47. 

52. A method for screening a cDNA library by identifying the 
expressed protein target, comprising: 

(a) providing a screening molecule comprising a 
methotrexate moiety or an analog of methotrexate covalently 
bonded to a ligand which has a known specificity; 

(b) introducing the screening molecule into a cell which 
expresses a first fusion protein comprising a binding domain 
capable of binding methotrexate, a second fusion protein 
comprising the expressed unknown protein target, and a reporter 
gene wherein expression of the reporter gene is conditioned on 
the proximity of the first fusion protein to the second fusion 
protein; 

(c) permitting the screening molecule to bind to the first 
fusion protein and to the second fusion protein so as to 
activate the expression of the reporter gene; 

(d) selecting which cell expresses the reporter gene; and 

(e) identifying the unknown protein target and the 
corresponding cDNA. 

53. The method of claim 52, wherein the unknown protein target 
is encoded by a DNA from the group consisting of 
genomic DNA, cDNA and synthetic DNA. 

54. A new protein cloned by the method of claim 52. 

FIGURE 15A FIGURE 15B FIGURE 15C 
FIGURE 16A 
FIGURE 17A 
FIGURE 17B 